Optimal quantum networks and one-shot entropies

Giulio Chiribella; Daniel Ebler

doi:10.1088/1367-2630/18/9/093053

1. Introduction

Advances in quantum communication [1–3] and in the integration of quantum hardware [4–8] are pushing towards the realization of networked quantum information systems, such as quantum communication networks [9–13] and distributed quantum computing [14–16]. Networks of interacting quantum devices are attracting interest also at the theoretical level, providing a framework for quantum games [17] and protocols [18–20], insights on the foundations of quantum mechanics [18, 21–23], a starting point for a general theory of Bayesian inference [24–31] and for the development of models of higher-order quantum computation [32–34].

The network scenario motivates a new set of optimization problems, where the goal is not to optimize individual devices, but rather to optimize how different devices interact with one another. In many situations, the devices operate in a well-defined causal order—this is the case, for example, in the circuit model of quantum computing, where computations are implemented by sequences of gates [35, 36]. Recently, researchers have started to investigate more general situations, where the causal order can be in a quantum superposition [20, 33, 34, 37–39] or can be indefinite in other more exotic ways, in principle compatible with quantum mechanics [34, 40–45]. In these new situations, optimizing quantum networks is important, for at least three reasons: First, in order to establish an advantage, one has to first find the optimal performances achievable in a definite causal order. Second, finding the maximum advantage requires an optimization over all non-causal networks. This is an essential step for assessing the power of the new, non-causal models of information processing. Third, identifying the ultimate performances achieved in the absence of pre-defined causal structure is expected to shed light on the interplay between quantum mechanics and spacetime.

In this paper we develop a semidefinite programming (SDP) approach to the optimization of quantum networks. We start by analyzing scenarios with definite causal order, choosing an operational measure of performance, quantified by how much the network scores in a given test. The test consists in sending inputs to the devices, performing local computations, and finally measuring the outputs. Tests of this type are also important in the theory of quantum interactive proof systems [46], wherein they are used to model the interaction between a prover and a verifier. The input-output behavior of a quantum causal network is described in the framework of quantum combs [18, 47] (also known as quantum strategies [17]), which associates a positive operator to any given sequence of quantum operations. In this framework, the optimization is a SDP. We work out the dual optimization problem, showing that the maximum score is quantified by a one-shot entropic quantity that characterizes the informativeness of the test. This quantity extends to networks the notion of max relative entropy [48–51] (see also the monograph [52]). Building on the connection with the max relative entropy, we define a measure of the amount of correlations that a causal network can generate over time. This quantity is based on the notion of conditional min-entropy [50, 51], originally defined for quantum states and extended here to quantum causal networks.

After discussing the causal case, we turn our attention to quantum networks with indefinite causal order. Some of these networks arise when multiple quantum devices are connected in a way that is controlled by the state of a quantum system [18, 20, 38]. Some other networks are not built by linking up individual devices [41]. They are 'networks' in a generalized sense: they are spatially distributed objects that can interact with a set of local devices. The description of these generalized networks is trickier, because we cannot specify their behavior in terms of the behavior of individual quantum devices. Instead, we must characterize them through the way they respond to external inputs. More specifically, a general quantum network is specified by a map that accepts as input the operations taking place in local laboratories and returns as output an operation, as in figure 1. Maps that transform quantum operations are known as quantum supermaps. They were originally introduced in the causal scenario [18, 53] and later generalized to the case of networks with indefinite causal structure [18, 34, 41]. These maps can be represented by positive operators, subject to a set of constraints that guarantee that valid operations are transformed into valid operations. Again, the form of these constraints leads to SDPs. In this case, we find that the maximum score can be expressed in terms of a max relative entropy, here named the max relative entropy of signaling, which quantifies the deviation from the set of no-signaling channels. In addition, we characterize the max relative entropy between two non-causal networks, showing that it is equal to the maximum of the max relative entropy over all the states that can be generated by interacting with the two networks.
This result opens the way to the definition of hypothesis testing protocols to probe the fundamental structure of spacetime, by testing the possibility of exotic non-causal networks against the null hypothesis that events have a well-defined causal structure.

**Figure 1.** Generalized network (in blue) interacting with two sequences of local devices in Alice's laboratory (orange boxes) and in Bob's laboratory (green boxes). Devices acting in the same laboratory are applied in a well-defined causal order, corresponding to the direction from left to right in the picture. However, no causal order is assumed between the devices in the two laboratories.

To illustrate the general method, we provide a number of applications to concrete tasks, involving the optimization of both causal and non-causal networks. For the optimization in the causal setting, we consider the tasks of inverting an unknown unitary dynamics, simulating the evolution of a charge conjugate particle, and adding control to an unknown unitary gate. Looking at these tasks in terms of network optimization is a relatively new approach and here we provide the first optimized solutions. For the optimization in the non-causal setting, we illustrate our method by analyzing the non-causal game introduced by Oreshkov et al [41]. In this case, we fix the operations performed by the players (as in [41]) and we search for the non-causal network that offers the largest advantage for these operations. Using the SDP approach, we obtain a simple proof of the optimality of the network presented in [41]. Optimality can also be derived from a recent result of Brukner [54], who considered a more general scenario where the players' operations are not fixed, but rather subject to optimization. When the operations are fixed as in [41], however, our SDP technique yields a significantly shorter optimality proof. The simplification in this restricted scenario suggests that SDP may prove useful also for the broader scope of identifying a non-causal analogue of the Tsirelson bound, which was the motivating problem of [54].

The paper is organized as follows. In section 2 we introduce the framework of quantum combs and the characterization of quantum causal networks. In section 3 we review the basic facts about SDP and establish a general relation with the max relative entropy. The general result is applied to quantum causal networks in section 4 and is then used to define a suitable extension of the conditional min-entropy (section 5) and of the max relative entropy (section 6). In sections 7 and 8 we extend the results to quantum networks without predefined causal structure. Our techniques are illustrated in section 9, where we present applications to the tasks of inverting unknown evolutions, simulating charge conjugation, controlling unitary gates, and maximizing the winning probability in a non-causal quantum game. Finally, the conclusions are drawn in section 10.

2. The framework of quantum combs

In this section we introduce the concepts required for the optimization of quantum causal networks. First of all, we review the connection between quantum channels and operators. Then, we present the basics of the framework of quantum combs.

2.1. Quantum operations, quantum channels, and the Choi isomorphism

Quantum operations [55] describe the most general transformations of quantum systems, including both the reversible transformations associated to unitary gates and the irreversible transformations due to measurements. A quantum operation with input system A and output system B is a completely positive trace non-increasing map ${ \mathcal C }$ , transforming operators on the input Hilbert space ${{ \mathcal H }}_{A}$ into operators on the output Hilbert space ${{ \mathcal H }}_{B}$ . We will often use the diagrammatic notation

We say that the quantum operation ${ \mathcal C }$ in the above diagram is of type $A\to B$ .

When system A is trivial (that is, when its Hilbert space is one-dimensional), the quantum operation ${ \mathcal C }$ corresponds to the preparation of a state of system B. When system B is trivial, the quantum operation ${ \mathcal C }$ in equation (1) corresponds to a measurement effect on system A. Measurement effects are positive (semidefinite) operators P satisfying $P\leqslant I$ , where I is the identity operator on the system's Hilbert space. Effects are associated to the outcomes of measurements and the probability of the outcome corresponding to the effect P is given by the Born rule

$\begin{eqnarray*}&&p=\Tr [P\,\rho ],\end{eqnarray*}$

where ρ is the state of the system before the measurement. In the special case where P is the identity operator, we represent the corresponding effect as

In general, quantum measurement processes are described by quantum instruments. A quantum instrument with input A and output B is a collection of quantum operations ${\{{{ \mathcal C }}_{x}\}}_{x\in {\mathsf{X}}}$ of type $A\to B$ , subject to the condition that the sum ${\sum }_{x\in {\mathsf{X}}}{{ \mathcal C }}_{x}$ is trace-preserving. Each quantum operation corresponds to one alternative outcome x and the probability that the quantum operation ${{ \mathcal C }}_{x}$ takes place on a given input state ρ is given by

$\begin{eqnarray*}&&{p}_{x}=\Tr [{{ \mathcal C }}_{x}(\rho )].\end{eqnarray*}$

When the instrument ${\{{{ \mathcal C }}_{x}\}}_{x\in {\mathsf{X}}}$ has a single outcome, say x₀, the corresponding process is deterministic, meaning that one can predict in advance that the outcome will be x₀. In this case, the quantum operation ${{ \mathcal C }}_{{x}_{0}}$ is trace preserving. Trace preserving quantum operations are also known as quantum channels.

Completely positive maps can be represented by positive operators. Let ${\mathsf{Lin}}({ \mathcal H })$ be the space of linear operators on the Hilbert space ${ \mathcal H }$ and let ${ \mathcal C }$ be a completely positive map transforming operators in ${\mathsf{Lin}}({{ \mathcal H }}_{0})$ into operators in ${\mathsf{Lin}}({{ \mathcal H }}_{1})$ . Then, the map ${ \mathcal C }$ can be represented by a positive operator $C\in {\mathsf{Lin}}({{ \mathcal H }}_{1}\otimes {{ \mathcal H }}_{0})$ , defined as

$\begin{eqnarray}&&C=({ \mathcal C }\otimes {{ \mathcal I }}_{0})(\,| I \rangle\rangle \langle\langle I| ),\end{eqnarray} \tag{ 4 }$

where ${{ \mathcal I }}_{0}$ denotes the identity map on ${\mathsf{Lin}}({{ \mathcal H }}_{0})$ and $| I \rangle\rangle$ is the unnormalized maximally entangled state $| I \rangle\rangle ={\sum }_{i}| i\rangle | i\rangle \in {{ \mathcal H }}_{0}\otimes {{ \mathcal H }}_{0}$ . The operator C is known as the Choi operator [56].

Quantum operations and quantum channels can be characterized in terms of their Choi operators: a positive operator $Q\in {\mathsf{Lin}}({{ \mathcal H }}_{1}\otimes {{ \mathcal H }}_{0})$ corresponds to a quantum operation if and only if it satisfies the condition

$\begin{eqnarray}&&{\Tr }_{1}[Q]\leqslant {I}_{0},\end{eqnarray} \tag{ 5 }$

where ${\Tr }_{1}$ denotes the partial trace over the Hilbert space ${{ \mathcal H }}_{1}$ , I₀ denotes the identity operator on the Hilbert space ${{ \mathcal H }}_{0}$ , and $\leqslant$ denotes the standard operator order: $A\leqslant B$ iff $\langle \varphi | A| \varphi \rangle \leqslant \langle \varphi | B| \varphi \rangle$ , $\forall | \varphi \rangle \in {{ \mathcal H }}_{0}$ . A positive operator $C\in {\mathsf{Lin}}({{ \mathcal H }}_{1}\otimes {{ \mathcal H }}_{0})$ corresponds to a quantum channel if and only if it satisfies the condition

$\begin{eqnarray}&&{\Tr }_{1}[C]={I}_{0}.\end{eqnarray} \tag{ 6 }$
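
Conditions (5) and (6) are easy to check numerically. The following is a minimal NumPy sketch, not part of the original derivation: it assumes the map is given through Kraus operators, uses the amplitude damping channel with an arbitrary damping parameter as a test case, and the helper names `choi_from_kraus` and `trace_out_first` are hypothetical.

```python
import numpy as np

def choi_from_kraus(kraus):
    """Choi operator on H_1 (x) H_0 of the map rho -> sum_k K rho K^dagger."""
    d_out, d_in = kraus[0].shape
    choi = np.zeros((d_out * d_in, d_out * d_in), dtype=complex)
    for K in kraus:
        # Row-major flattening gives (K (x) I)|I>> = sum_i K|i> (x) |i>.
        vec = np.asarray(K, dtype=complex).reshape(-1)
        choi += np.outer(vec, vec.conj())
    return choi

def trace_out_first(op, d_first, d_second):
    """Partial trace over the first factor of an operator on H_first (x) H_second."""
    t = op.reshape(d_first, d_second, d_first, d_second)
    return np.einsum('ajak->jk', t)

# Illustrative test case: amplitude damping with (arbitrary) parameter g.
g = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)
K1 = np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)

C = choi_from_kraus([K0, K1])   # full channel
Q = choi_from_kraus([K1])       # a single branch: a quantum operation only

is_positive = np.linalg.eigvalsh(C).min() > -1e-12
is_channel = np.allclose(trace_out_first(C, 2, 2), np.eye(2))        # equation (6)
gap = np.eye(2) - trace_out_first(Q, 2, 2)
is_operation = np.linalg.eigvalsh(gap).min() > -1e-12                # equation (5)
```

The output space is taken as the first tensor factor, matching $C\in {\mathsf{Lin}}({{ \mathcal H }}_{1}\otimes {{ \mathcal H }}_{0})$ in the text.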

2.2. The link product

Two quantum operations can be connected with each other, as long as the output of the first operation matches the input of the second. At the level of Choi operators, the connection is implemented by the operation of link product [47], denoted as $\ast$ . To define the link product, it is convenient to introduce the following shorthand notation: if A is an operator on ${{ \mathcal H }}_{X}\otimes {{ \mathcal H }}_{Y}$ and B is an operator on ${{ \mathcal H }}_{Y}\otimes {{ \mathcal H }}_{Z}$ , then we use the notation AB for the product

$\begin{eqnarray}&&{AB}:= (A\otimes {I}_{Z})({I}_{X}\otimes B).\end{eqnarray} \tag{ 7 }$

With this notation, the link product of A and B is the operator $A\ast B$ defined as

$\begin{eqnarray}&&A\ast B:= {\Tr }_{Y}[A\,{B}^{{T}_{Y}}],\end{eqnarray} \tag{ 8 }$

where ${B}^{{T}_{Y}}$ denotes the partial transpose of B with respect to the Hilbert space ${{ \mathcal H }}_{Y}$ . Note that the definition of the link product presupposes that the Hilbert spaces have been labeled: in order to compute the link product, one needs to take the partial transpose and the trace on the Hilbert space in common between A and B. Mathematically, the partial transpose in the rhs of equation (8) is essential to guarantee that the link product of two positive operators is a positive operator [47] (as a counterexample, think of the case where ${{ \mathcal H }}_{X}$ , ${{ \mathcal H }}_{Y}$ , and ${{ \mathcal H }}_{Z}$ are two-dimensional and A and B are projectors on a maximally entangled state: in this case, removing the partial transpose results in a non-positive $A\ast B$ ). Physically, the role of the partial transpose can be understood in terms of entanglement swapping [57, 58]. Thanks to the partial transpose, the link product can be expressed as

$\begin{eqnarray}&&A\ast B={\Tr }_{Y}{\Tr }_{Y^{\prime} }\,[({A}_{{XY}}\otimes {B}_{Y^{\prime} Z})\,({I}_{X}\otimes | I \rangle\rangle \langle\langle I| \otimes {I}_{Z})],\end{eqnarray} \tag{ 9 }$

where $| I \rangle\rangle := {\sum }_{n=1}^{{d}_{Y}}\,| n\rangle | n\rangle$ is the unnormalized maximally entangled state on ${{ \mathcal H }}_{Y}\otimes {{ \mathcal H }}_{Y^{\prime} }$ , ${{ \mathcal H }}_{Y^{\prime} }$ being an identical copy of ${{ \mathcal H }}_{Y}$ . This means that, up to normalization, the link product $A\ast B$ is the state obtained when a Bell measurement, performed on the states $A/\Tr A$ and $B/\Tr B$ , yields the outcome corresponding to the projector $| I \rangle\rangle \langle\langle I| /{d}_{Y}$ . At the fundamental level, the possibility of representing operations as states and their composition as entanglement swapping follows from the Purification Principle—the property that every state can be obtained as the marginal of a pure state, unique up to reversible transformations [59].

The link product is associative, namely

$\begin{eqnarray*}&&A\ast (B\ast C)=(A\ast B)\ast C,\end{eqnarray*}$

for all operators A, B, and C. Moreover, the link product is commutative, up to re-ordering of the Hilbert spaces: in formula,

$\begin{eqnarray*}&&A\ast B\simeq B\ast A,\end{eqnarray*}$

having used the notation $A\ast B\simeq B\ast A$ to mean $A\ast B={{\mathtt{SWAP}}}_{{XZ}}\,(\,B\ast A\,)\,{{\mathtt{SWAP}}}_{{XZ}}$ , where ${{\mathtt{SWAP}}}_{{XZ}}$ is the unitary operator that swaps the spaces ${{ \mathcal H }}_{X}$ and ${{ \mathcal H }}_{Z}$ . From now on we will omit the swaps, implicitly understanding that the Hilbert spaces have been reordered in the right way wherever needed.

Using the above notation, we have the following

Proposition 1 [47]. Let ${ \mathcal A }$ be a quantum operation transforming operators on ${{ \mathcal H }}_{0}$ to operators on ${{ \mathcal H }}_{1}$ , let ${ \mathcal B }$ be a quantum operation transforming operators on ${{ \mathcal H }}_{1}$ to operators on ${{ \mathcal H }}_{2}$ , and let ${ \mathcal C }={ \mathcal B }{ \mathcal A }$ be the quantum operation resulting from the composition of ${ \mathcal A }$ and ${ \mathcal B }$ . Then, one has

$\begin{eqnarray*}&&C=A\ast B,\end{eqnarray*}$

where $A,B$ , and $C$ are the Choi operators of ${ \mathcal A },{ \mathcal B }$ , and ${ \mathcal C }$ , respectively.
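
Proposition 1 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions rather than the authors' construction: the two channels (amplitude damping followed by a qubit-to-qutrit isometry) are arbitrary choices, and the helper names are hypothetical. It evaluates the link product in two ways, as an index contraction over the shared space and literally as in equation (8), and compares both with the Choi operator of the composed channel.

```python
import numpy as np

def choi(kraus):
    """Choi operator on H_out (x) H_in from a list of Kraus operators."""
    return sum(np.outer(K.reshape(-1), K.reshape(-1).conj()) for K in kraus)

def link_product(A, B, d0, d1, d2):
    """A on H_1 (x) H_0, B on H_2 (x) H_1; returns A * B on H_2 (x) H_0."""
    At = A.reshape(d1, d0, d1, d0)
    Bt = B.reshape(d2, d1, d2, d1)
    # Contract both indices of the shared space H_1.
    return np.einsum('ycze,aybz->acbe', At, Bt).reshape(d2 * d0, d2 * d0)

def link_product_literal(A, B, d0, d1, d2):
    """Equation (8) written out: Tr_1[(I_2 (x) A)(B^{T_1} (x) I_0)]."""
    B_T1 = B.reshape(d2, d1, d2, d1).transpose(0, 3, 2, 1).reshape(d2 * d1, d2 * d1)
    full = np.kron(np.eye(d2), A) @ np.kron(B_T1, np.eye(d0))
    t = full.reshape(d2, d1, d0, d2, d1, d0)
    return np.einsum('aycbye->acbe', t).reshape(d2 * d0, d2 * d0)

# First channel (H_0 -> H_1): amplitude damping, arbitrary parameter.
g = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)
K1 = np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)
# Second channel (H_1 -> H_2): an isometry from a qubit into a qutrit.
V = np.array([[1, 0], [0, 1 / np.sqrt(2)], [0, 1j / np.sqrt(2)]], dtype=complex)

A = choi([K0, K1])
B = choi([V])
C_direct = choi([V @ K0, V @ K1])          # Choi operator of the composition
C_link = link_product(A, B, 2, 2, 3)
C_literal = link_product_literal(A, B, 2, 2, 3)
```

In tensor form the partial transpose disappears: the link product simply matches the row index of A on the shared space with the row index of B, and likewise for the column indices.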

In the next paragraph we will use the link product to construct the Choi operator of quantum networks consisting of multiple interconnected quantum operations.

2.3. Quantum causal networks and quantum combs

A quantum network is a collection of quantum devices connected with each other. We will call the network causal if there are no loops connecting the output of a device to the input of the same device. Mathematically, a quantum causal network can be represented by a directed acyclic graph (DAG), where each vertex of the graph corresponds to a quantum device (see figure 2). For every DAG, one can always define a total ordering of the vertices, through a procedure known as topological sorting [60]. Using this fact, one can always represent a quantum causal network as an ordered sequence of quantum devices, such as

where ${A}_{j}^{\mathrm{in}}$ ( ${A}_{j}^{\mathrm{out}}$ ) denotes the input (output) system of the network at the jth time step.

**Figure 2.** A quantum causal network is a directed acyclic graph, whose nodes (orange boxes in the picture) represent quantum devices and whose directed edges indicate the input/output direction.

We say that a network is deterministic if all devices in the network are deterministic, i.e. if they are described by quantum channels. Using the link product, we associate a Choi operator to the network: specifically, if the individual channels in the network have Choi operators ${C}_{1},{C}_{2},...,{C}_{N}$ , then the network has Choi operator

$\begin{eqnarray}&&C={C}_{1}\ast {C}_{2}\ast {C}_{3}\ast \cdots \ast {C}_{N}.\end{eqnarray} \tag{ 11 }$

The Choi operator of a deterministic network is called a quantum comb [18, 47], or also a quantum strategy [17]. The quantum comb C is a positive operator on ${\bigotimes }_{j=1}^{N}({{ \mathcal H }}_{j}^{\mathrm{out}}\otimes {{ \mathcal H }}_{j}^{\mathrm{in}})$ , where ${{ \mathcal H }}_{j}^{\mathrm{in}}$ ( ${{ \mathcal H }}_{j}^{\mathrm{out}}$ ) is the Hilbert space of system ${A}_{j}^{\mathrm{in}}$ ( ${A}_{j}^{\mathrm{out}}$ ). Quantum combs can be characterized as follows:

Proposition 2 [17, 18, 47]. A positive operator $C$ is a quantum comb if and only if it satisfies the linear constraints

$\begin{eqnarray}&&{\Tr }_{{A}_{n}^{\mathrm{out}}}[{C}^{(n)}]={I}_{{A}_{n}^{\mathrm{in}}}\otimes {C}^{(n-1)}\qquad \forall n\in \{1,\ldots ,N\},\end{eqnarray} \tag{ 12 }$

where ${\Tr }_{A}$ is the partial trace over the Hilbert space ${{ \mathcal H }}_{A}$ , ${C}^{(n)}$ is a suitable operator on ${{ \mathcal H }}_{n}:= {\bigotimes }_{j=1}^{n}({{ \mathcal H }}_{j}^{\mathrm{out}}\otimes {{ \mathcal H }}_{j}^{\mathrm{in}})$ , ${C}^{(N)}:= C$ , and ${C}^{(0)}:= 1$ .

The constraints in equation (12) are a direct consequence of the normalization condition of quantum channels, expressed by equation (6). Physically, the positive operator ${C}^{(n)}$ represents the subnetwork transforming the first n inputs to the first n outputs. We denote by

$\begin{eqnarray*}&&{\mathsf{Comb}}({A}_{1}^{\mathrm{in}}\to {A}_{1}^{\mathrm{out}},{A}_{2}^{\mathrm{in}}\to {A}_{2}^{\mathrm{out}},...,{A}_{N}^{\mathrm{in}}\to {A}_{N}^{\mathrm{out}})\end{eqnarray*}$

the set of positive semidefinite operators satisfying the constraint (12). When there is no ambiguity, we will simply write ${\mathsf{Comb}}$ .
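
As a worked instance of the constraint (12), consider a network with $N=2$ time steps (a choice made here purely for illustration). Using ${C}^{(2)}=C$ and ${C}^{(0)}=1$ , the constraints reduce to

$\begin{eqnarray*}{\Tr }_{{A}_{2}^{\mathrm{out}}}[C] & = & {I}_{{A}_{2}^{\mathrm{in}}}\otimes {C}^{(1)}\\ {\Tr }_{{A}_{1}^{\mathrm{out}}}[{C}^{(1)}] & = & {I}_{{A}_{1}^{\mathrm{in}}}.\end{eqnarray*}$

The second line is the channel condition (6) for the first time step. The first line states that, once the last output is discarded, the network acts trivially on the last input, so that the input at the second step cannot influence the output at the first step.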

2.4. Quantum testers and the generalized Born rule

So far we considered deterministic networks, resulting from the connection of quantum channels. However, it is also useful to consider networks containing measurement devices, which may generate random outcomes. We call such networks non-deterministic. Non-deterministic quantum networks can be thought of as the quantum version of classical electric networks containing measurement devices, such as voltmeters and ammeters. Just as these classical relatives are useful for testing the behavior of electrical circuits, quantum non-deterministic networks are useful for testing the behavior of quantum circuits, or, slightly more broadly, physical processes consisting of multiple time steps.

An example of non-deterministic network is the following

where ρ is a quantum state, $({{ \mathcal D }}_{1},...,{{ \mathcal D }}_{N-1})$ is a sequence of quantum channels, and ${\{{P}_{x}\}}_{x\in {\mathsf{X}}}$ is a positive operator-valued measure (POVM), describing a quantum measurement on the last output system. Networks of the type (13) can be used to probe quantum networks of the type (10), as follows

When the two networks are wired together, the final measurement produces one of the outcomes in the set ${\mathsf{X}}$ . Using proposition 1, the probability of the outcome x can be computed as

$\begin{eqnarray}{p}_{x} & = & \rho \ast {C}_{1}\ast {D}_{1}\ast {C}_{2}\ast {D}_{2}\ast \cdots \ast {D}_{N-1}\ast {C}_{N}\ast {P}_{x}^{T}\\ & = & (\rho \ast {D}_{1}\ast {D}_{2}\ast \cdots \ast {D}_{N-1}\ast {P}_{x}^{T})\,\ast (\,{C}_{1}\ast {C}_{2}\ast \cdots \ast {C}_{N}^{\phantom{\dagger }})\\ & = & {T}_{x}\ast C\\ & = & \Tr [\,{T}_{x}\,{C}^{T}\,],\end{eqnarray} \tag{ 15 }$

where C is the Choi operator of the tested network, C^T is the transpose of C, and ${\{{T}_{x}\}}_{x\in {\mathsf{X}}}$ is the collection of operators defined by

$\begin{eqnarray}&&{T}_{x}:= \rho \ast {D}_{1}\ast {D}_{2}\ast \cdots \ast {D}_{N-1}\ast {P}_{x}^{T}\end{eqnarray} \tag{ 16 }$

(here the transpose of P_x is needed because, according to definition (4), the Choi operator of the quantum operation ${{ \mathcal Q }}_{x}(\cdot )=\Tr [{P}_{x}\cdot ]$ is ${P}_{x}^{T}$ instead of P_x).

We call the set of operators ${\bf{T}}={\{{T}_{x}\}}_{x\in {\mathsf{X}}}$ a quantum tester and equation (15) the generalized Born rule [18, 61, 62]. The quantum tester ${\bf{T}}$ describes the response of the non-deterministic network (13) when connected to external devices. Quantum testers are a useful abstraction in many applications, such as quantum games [17] and cryptographic protocols [19, 20], quantum interactive proof systems [46], quantum learning of gates [63–65], quantum channel discrimination [61, 66, 67], incompatibility of multitime quantum measurements [68], tomography of quantum channels [62, 69], non-Markovian processes [70, 71], and causal models [28].
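
The generalized Born rule (15) can be verified numerically in the simplest case. The sketch below assumes a single time step (N = 1) and no ancilla, so that the tester of equation (16) reduces to ${T}_{x}={P}_{x}^{T}\otimes \rho$ on ${{ \mathcal H }}^{\mathrm{out}}\otimes {{ \mathcal H }}^{\mathrm{in}}$ . The channel, state, and POVM are arbitrary illustrative choices, picked with complex entries so that the transposes actually matter.

```python
import numpy as np

# Tested network: amplitude damping channel (arbitrary parameter), as Choi operator.
g = 0.3
kraus = [np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex),
         np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)]
C = sum(np.outer(K.reshape(-1), K.reshape(-1).conj()) for K in kraus)

# Testing network: input state rho and a two-outcome projective measurement.
psi_plus = np.array([1, 1j]) / np.sqrt(2)
psi_minus = np.array([1, -1j]) / np.sqrt(2)
rho = np.outer(psi_plus, psi_plus.conj())
povm = [np.outer(psi_plus, psi_plus.conj()),
        np.outer(psi_minus, psi_minus.conj())]

# Tester operators for N = 1 without ancilla: T_x = P_x^T (x) rho.
testers = [np.kron(P.T, rho) for P in povm]

# Generalized Born rule: p_x = Tr[T_x C^T].
p_tester = np.array([np.trace(T @ C.T).real for T in testers])

# Direct computation: p_x = Tr[P_x C(rho)].
out_state = sum(K @ rho @ K.conj().T for K in kraus)
p_direct = np.array([np.trace(P @ out_state).real for P in povm])
```

Both routes give the same probability vector, and the probabilities sum to one because the tested network is a channel.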

Quantum testers can be characterized as follows:

Proposition 3 [61]. Let ${\bf{T}}$ be a collection of positive operators on ${\bigotimes }_{j=1}^{N}({{ \mathcal H }}_{j}^{\mathrm{out}}\otimes {{ \mathcal H }}_{j}^{\mathrm{in}})$ . ${\bf{T}}$ is a quantum tester if and only if

$\begin{eqnarray}\displaystyle \sum _{x\in {\mathsf{X}}}\,{T}_{x} & = & {I}_{{A}_{N}^{\mathrm{out}}}\otimes {{\rm{\Gamma }}}^{(N)}\\ {\Tr }_{{A}_{n}^{\mathrm{in}}}[\,{{\rm{\Gamma }}}^{(n)}\,] & = & {I}_{{A}_{n-1}^{\mathrm{out}}}\otimes {{\rm{\Gamma }}}^{(n-1)},\quad n=2,\ldots ,N\\ {\Tr }_{{A}_{1}^{\mathrm{in}}}[\,{{\rm{\Gamma }}}^{(1)}\,] & = & 1,\end{eqnarray} \tag{ 17 }$

where each ${{\rm{\Gamma }}}^{(n)}$ , $n=1,\ldots ,N$ is a positive operator on ${{ \mathcal H }}_{n}^{\mathrm{in}}\otimes [{\bigotimes }_{j=1}^{n-1}({{ \mathcal H }}_{j}^{\mathrm{out}}\otimes {{ \mathcal H }}_{j}^{\mathrm{in}})]$ .
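
As a simple illustration of equation (17), consider the case $N=1$ (chosen here only as an example). The conditions become

$\begin{eqnarray*}\displaystyle \sum _{x\in {\mathsf{X}}}\,{T}_{x} & = & {I}_{{A}_{1}^{\mathrm{out}}}\otimes {{\rm{\Gamma }}}^{(1)}\\ \Tr [\,{{\rm{\Gamma }}}^{(1)}\,] & = & 1,\end{eqnarray*}$

so that ${{\rm{\Gamma }}}^{(1)}$ is a density matrix on ${{ \mathcal H }}_{1}^{\mathrm{in}}$ . For instance, a tester that prepares a state ρ and measures a POVM ${\{{P}_{x}\}}_{x\in {\mathsf{X}}}$ on the output without using any ancilla has ${T}_{x}={P}_{x}^{T}\otimes \rho$ , in agreement with equation (16), and therefore ${{\rm{\Gamma }}}^{(1)}=\rho$ .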

2.5. Assessing the performance of a quantum network

Suppose that we are given black box access to a quantum network, whose internal functioning is unknown to us. Our goal is to assess how well the network fares in a desired task, such as solving a desired computational problem [72], estimating an unknown parameter [64, 65], emulating a sequence of gates [63, 73], or replicating the action of a desired gate [74–78].

For example, suppose that a manufacturer provides us with a special-purpose computer, designed to implement a quantum search algorithm. How can we test the performance of our device? Since the computer is claimed to find the location of an item in a list, a natural approach is to place the item in a set of random positions and then to check whether the answer provided by the computer is correct. A simple measure of performance is given by the number of inputs on which the computer gives the right answer. More generally, one can assign different scores depending on the distance between the correct answer and the output of the computer. Let us consider this example in more detail, as a concrete illustration of what it means to test a quantum network. Suppose that the computer attempts to reproduce Grover's algorithm [79], by interacting with unitary gates ${U}_{i}=2| i\rangle \langle i| -I$ that encode the position of an item i in a list of K items. A possible test, illustrated in figure 3, is as follows:

(i) Prepare a 'position register' in the maximally mixed state $\rho =I/K$ .
(ii) Upon receiving an input from the computer, apply the control-unitary gate $W={\sum }_{i=1}^{K}\,| i\rangle \langle i| \otimes {U}_{i}$ to the position register and the input.
(iii) Repeat the previous operation until the computer returns an output. In this way, the input provided by the computer is processed by a gate U_i, with i chosen at random.
(iv) Compare the output with the actual position, by performing a joint measurement on the position register and the output register. The measurement is described by the POVM ${\{{P}_{x}\}}_{x=-K}^{K}$ with operators given by
$\begin{eqnarray*}&&{P}_{x}=\displaystyle \sum _{i=\max \{0,-x\}}^{\min \{K,K-x\}}| i\rangle \langle i| \otimes | i+x\rangle \langle i+x| .\end{eqnarray*}$
In this way, the measurement outcome returns the deviation x from the correct position.
(v) If the deviation is x, assign score ${\omega }_{x}=1-| x| /K$ .
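
The ingredients of the test above can be written down explicitly for a small list. The following NumPy sketch is only an illustration: it assumes K = 4 and, unlike the index ranges in the text, labels positions from 0 to K-1, so that deviations run from -(K-1) to K-1. The two 'computers' it scores (one that always answers correctly, one that guesses uniformly at random) are hypothetical stand-ins represented directly by the joint state of position and output registers before the final measurement.

```python
import numpy as np

K = 4                      # list size (illustrative assumption)
I = np.eye(K)

def proj(i):
    """Projector |i><i| on a K-dimensional register."""
    return np.outer(I[:, i], I[:, i])

# Steps (i)-(ii): position register state and control-unitary gate W.
rho = I / K
U = [2 * proj(i) - I for i in range(K)]
W = sum(np.kron(proj(i), U[i]) for i in range(K))

# Step (iv): POVM element for deviation x (zero-based labels assumed).
def povm_element(x):
    P = np.zeros((K * K, K * K))
    for i in range(max(0, -x), min(K, K - x)):
        P += np.kron(proj(i), proj(i + x))
    return P

deviations = range(-(K - 1), K)
P = {x: povm_element(x) for x in deviations}

# Step (v): score assigned to each deviation.
score = {x: 1 - abs(x) / K for x in deviations}

def expected_score(joint_state):
    """Average score for a joint state of position and output registers."""
    return sum(score[x] * np.trace(P[x] @ joint_state).real for x in deviations)

# Two hypothetical computers, described by the state before the measurement.
always_right = sum(np.kron(proj(i), proj(i)) for i in range(K)) / K
random_guess = np.kron(rho, I / K)
```

With these definitions, `expected_score(always_right)` returns the maximum score 1, while the random guesser scores strictly less.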

**Figure 3.** A computer is designed to implement Grover's algorithm. The action of the computer (in orange) is tested by a testing circuit (in blue), consisting in the preparation of a randomly chosen input (encoded in the state $\rho =I/K$ of a 'position register'), followed by the application of the control-unitary gate W, which, depending on the input, performs one of the unitaries U_i. In the end, the computer outputs an outcome j (encoded in the output of the channel ${{ \mathcal C }}_{N}$ ), which is compared with the position register through a suitable quantum measurement (POVM $\{{P}_{x}\}$ ), which outputs the deviation $x=i-j$ .

Mathematically, the test is represented by the quantum tester ${\{{T}_{x}\}}_{x=-K}^{K}$ with

$\begin{eqnarray*}&&{T}_{x}=\rho \ast | W \rangle\rangle \langle\langle W| \ast \cdots \ast | W \rangle\rangle \langle\langle W| \ast {P}_{x}^{T}.\end{eqnarray*}$

The sequence of operations performed by the computer is represented by the quantum comb

$\begin{eqnarray*}&&C=| \phi \rangle \langle \phi | \ast {C}_{1}\ast \cdots \ast {C}_{N},\end{eqnarray*}$

and the probability of finding a deviation x is given by

$\begin{eqnarray*}&&{p}_{x}={T}_{x}\ast C\,.\end{eqnarray*}$

The average score obtained by the computer can be expressed as

$\begin{eqnarray*}\omega & = & \displaystyle \sum _{i,j=0}^{K}\left(1-\displaystyle \frac{| i-j| }{K}\right)\,{p}_{i-j}\\ & = & \displaystyle \sum _{x=-K}^{K}\left(1-\displaystyle \frac{| x| }{K}\right)\,{T}_{x}\ast C\\ & = & {\rm{\Omega }}\ast C,\end{eqnarray*}$

where Ω is the operator ${\rm{\Omega }}:= {\sum }_{x}\left(1-\tfrac{| x| }{K}\right)\,{T}_{x}$ .

Generalizing the above example, we assess the performance of an unknown quantum network by referring to experiments where the unknown network is connected to a 'testing network', containing measuring devices. The testing network will return an outcome x, to which one can assign a 'score' ${\omega }_{x}$ . In this way, the expected score serves as an operational measure of performance. Specifically, let C be the quantum comb describing the tested network and let ${\bf{T}}=\{{T}_{x},x\in {\mathsf{X}}\}$ be the quantum tester describing the testing network. Then, the average score is given by

$\begin{eqnarray}\omega & = & \displaystyle \sum _{x}\,{\omega }_{x}\,(\,{T}_{x}\ast C\,)\\ & = & {\rm{\Omega }}\ast C\qquad {\rm{\Omega }}:= \displaystyle \sum _{x}\,{\omega }_{x}\,{T}_{x}.\end{eqnarray} \tag{ 18 }$

Note that the performance of the network C is completely determined by the operator Ω, which we call the performance operator.

For a given performance operator Ω, the maximum expected score is given by

$\begin{eqnarray}{\omega }_{\max } & := & \underset{C\in {\mathsf{Comb}}}{\max }\,{\rm{\Omega }}\ast C\\ & = & \underset{C\in {\mathsf{Comb}}}{\max }\,\Tr [{\rm{\Omega }}\,{C}^{T}]\\ & = & \underset{C\in {\mathsf{Comb}}}{\max }\,\Tr [{\rm{\Omega }}\,C].\end{eqnarray} \tag{ 19 }$

The third equality comes from the fact that the set of quantum combs is closed under transposition and, therefore, we can omit the transpose in equation (15). Using the notation

$\begin{eqnarray}&&\langle A,B\rangle := \Tr [{AB}],\end{eqnarray} \tag{ 20 }$

we express the maximum score as

$\begin{eqnarray}&&{\omega }_{\max }:= \underset{C\in {\mathsf{Comb}}}{\max }\,\langle {\rm{\Omega }},C\rangle .\end{eqnarray} \tag{ 21 }$

The above equation shows that the search for the maximum score is a SDP. The basic tools needed to address it will be reviewed in the next section.

3. SDP and the max relative entropy

3.1. Basic facts about SDP

Here we review the background about SDP. For further details, we refer the reader to Watrous' lecture notes [80].

Let ${ \mathcal X }$ and ${ \mathcal Y }$ be two Hilbert spaces and let ${\mathsf{Herm}}({ \mathcal X })$ and ${\mathsf{Herm}}({ \mathcal Y })$ be the spaces of Hermitian operators on ${ \mathcal X }$ and ${ \mathcal Y }$ , respectively.

Definition 1. A SDP is a triple $(\phi ,A,B)$ , where $A$ and $B$ are operators in ${\mathsf{Herm}}({ \mathcal X })$ and ${\mathsf{Herm}}({ \mathcal Y })$ , respectively, and ϕ is a linear map from ${\mathsf{Herm}}({ \mathcal X })$ to ${\mathsf{Herm}}({ \mathcal Y })$ .

A SDP is associated to an optimization problem in the standard form

$\begin{eqnarray}&&\mathrm{maximize}\ \langle A,X\rangle \\ &&\mathrm{subject}\,\mathrm{to}\ \phi (X)=B\\ &&X\geqslant 0.\end{eqnarray} \tag{ 22 }$

This problem is known as the primal. The dual problem is

$\begin{eqnarray}&&\mathrm{minimize}\ \langle B,Y\rangle \\ &&\mathrm{subject}\,\mathrm{to}\ {\phi }^{\dagger }(Y)\geqslant A\\ &&Y\in {\mathsf{Herm}}({ \mathcal Y }),\end{eqnarray} \tag{ 23 }$

where ${\phi }^{\dagger }$ is the adjoint of ϕ, namely the linear map defined by the relation

$\begin{eqnarray*}&&\langle X,{\phi }^{\dagger }(Y)\rangle =\langle \phi (X),Y\rangle ,\qquad \forall X\in {\mathsf{Herm}}({ \mathcal X }),\forall Y\in {\mathsf{Herm}}({ \mathcal Y }).\end{eqnarray*}$

The optimal values of the primal and dual problems, denoted as

$\begin{eqnarray*}&&{\omega }_{\mathrm{primal}}:= \sup \ \langle A,X\rangle \qquad \mathrm{and}\qquad {\omega }_{\mathrm{dual}}:= \inf \ \langle B,Y\rangle ,\end{eqnarray*}$

are related by duality: for every SDP, one has the weak duality ${\omega }_{\mathrm{primal}}\leqslant {\omega }_{\mathrm{dual}}$ . The strong duality ${\omega }_{\mathrm{primal}}={\omega }_{\mathrm{dual}}$ holds under suitable conditions, provided by Slater's theorem [81]. In this paper we will use the following.

Proposition 4. Let $(\phi ,A,B)$ be a SDP. If there exists a positive operator $X$ satisfying $\phi (X)=B$ and an Hermitian operator $Y$ satisfying ${\phi }^{\dagger }(Y)\gt A$ , then ${\omega }_{\mathrm{primal}}={\omega }_{\mathrm{dual}}$ .

For the proof, see e.g. [80].
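As a minimal numerical sketch of the primal-dual pair (22) and (23), consider the illustrative choice $\phi =\Tr$ and $B=1$ (our choice, not taken from the text), so that the primal feasible set is the set of density matrices and ${\phi }^{\dagger }(y)=y\,I$ . The matrix `A` and all variable names are hypothetical. The hypotheses of proposition 4 are met by $X=I/d$ and by any $y$ larger than the maximum eigenvalue of $A$ , and both optimal values are expected to coincide with that eigenvalue.

```python
import numpy as np

# Illustrative SDP: phi(X) = Tr[X], B = 1, so phi^dagger(y) = y * I.
A = np.array([[1.0, 0.5],
              [0.5, -0.2]])          # hypothetical Hermitian objective

evals, evecs = np.linalg.eigh(A)     # eigenvalues in ascending order
v = evecs[:, -1]                     # eigenvector of the largest eigenvalue

# Primal candidate: the rank-one density matrix X = |v><v|.
X = np.outer(v, v.conj())
primal_value = np.trace(A @ X).real

# Dual candidate: y = lambda_max(A), feasible because y*I - A >= 0.
y = evals[-1]
dual_slack = np.linalg.eigvalsh(y * np.eye(2) - A).min()
dual_value = y                       # <B, Y> = 1 * y

print(primal_value, dual_value)      # the two values coincide
```

Since a feasible primal point and a feasible dual point give the same value, weak duality implies that both are optimal for this toy problem.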

3.2. The max relative entropy

An important quantity in one-shot quantum information theory is the max relative entropy, introduced by Datta in [82]:

Definition 2. Let $A$ and $B$ be two positive operators on ${ \mathcal X }$ . The max entropy of A relative to $B$ is given by

$\begin{eqnarray}&&{D}_{\max }(A\,\parallel \,B):= -\mathrm{log}\,\max \{w\,| \,w\,A\leqslant B\},\end{eqnarray} \tag{ 24 }$

with the convention $\mathrm{log}0:= -\infty$ .
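For a strictly positive $B$ , the maximum in equation (24) can be evaluated in closed form: the largest $w$ satisfying $w\,A\leqslant B$ is the inverse of the largest eigenvalue of ${B}^{-1/2}A{B}^{-1/2}$ . The sketch below uses logarithms in base 2, consistently with the expressions of the form ${2}^{{D}_{\max }}$ used later, and assumes that $B$ is full rank and $A$ is nonzero; the function names are hypothetical.

```python
import numpy as np

def max_weight(A, B):
    """Largest w with w*A <= B, assuming B strictly positive and A >= 0, A != 0."""
    b_vals, b_vecs = np.linalg.eigh(B)
    B_inv_sqrt = (b_vecs * b_vals ** -0.5) @ b_vecs.conj().T
    M = B_inv_sqrt @ A @ B_inv_sqrt
    M = (M + M.conj().T) / 2          # remove rounding asymmetry
    return 1.0 / np.linalg.eigvalsh(M).max()

def d_max(A, B):
    """Max relative entropy of equation (24), with the logarithm in base 2."""
    return -np.log2(max_weight(A, B))

rho = np.diag([1.0, 0.0])             # pure state |0><0|
sigma = np.eye(2) / 2                 # maximally mixed state
print(d_max(rho, sigma))              # close to 1: the largest weight is w = 1/2
```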

The max relative entropy provides one way to quantify the deviation of A from B. More generally, it is useful to consider the deviation between A and a set of operators:

Definition 3. Let $A$ be a positive operator on ${ \mathcal X }$ and let ${\mathsf{S}}\subset {\mathsf{Herm}}({ \mathcal X })$ be a set of positive operators. The max entropy of A relative to the set ${\mathsf{S}}$ , denoted as ${D}_{\max }(A\,\parallel \,{\mathsf{S}})$ , is the quantity defined by

$\begin{eqnarray}&&{D}_{\max }(A\,\parallel \,{\mathsf{S}}):= \underset{B\in {\mathsf{S}}}{\inf }\,{D}_{\max }(A\,\parallel \,B).\end{eqnarray} \tag{ 25 }$

The max relative entropy between a quantum state and a set of quantum states plays a central role in entanglement theory [83], where relative entropies are used to quantify the deviation from the set of separable states, and in quantum thermodynamics [84, 85], where relative entropies are used to quantify the deviation from the set of Gibbs states. In this paper we will extend the application of the max relative entropy to dynamical scenarios, where ${\mathsf{S}}$ represents a set of quantum networks. This extension is promising, e.g. for applications to hypothesis testing. Indeed, it is natural to consider scenarios where one has a null hypothesis on the input-output behavior of a quantum network and one wants to test the null hypothesis against an alternative hypothesis. In the case of quantum states, the minimum probability of a type II error (failing to accept the alternative hypothesis) can be estimated in terms of the max relative entropy [86]. In the case of quantum networks, it is natural to expect that the max relative entropy defined here will yield similar bounds; a result in this direction will be provided in sections 6 and 8.

3.3. From SDPs to the max relative entropy

In this section we provide a general bound on the primal value of an arbitrary SDP. The bound can always be attained and its value can be expressed in terms of a max relative entropy whenever the operator A in the SDP $(\phi ,A,B)$ is positive. To state the result, we need some basic notation, provided in the following:

For a vector space ${ \mathcal V }$ , we denote by ${{ \mathcal V }}^{* }$ the dual space, i.e. the space of linear functionals on ${ \mathcal V }$ . Given a subset ${\mathsf{S}}\subseteq { \mathcal V }$ , we define the dual affine space $\bar{{\mathsf{S}}}$ as

$\begin{eqnarray*}&&\bar{{\mathsf{S}}}:= \{{\rm{\Gamma }}\in {{ \mathcal V }}^{* }\,| \,\langle {\rm{\Gamma }},X\rangle =1,\forall X\in {\mathsf{S}}\}.\end{eqnarray*}$

Regarding ${ \mathcal V }$ as a subspace of ${{ \mathcal V }}^{\ast \ast }$ , one has the inclusion ${\mathsf{S}}\subseteq \bar{\bar{{\mathsf{S}}}}$ . When ${ \mathcal V }$ is finite dimensional and ${\mathsf{S}}$ is an affine set, one has the equality ${\mathsf{S}}=\bar{\bar{{\mathsf{S}}}}$ .

Given a SDP $(\phi ,A,B)$ , we define the primal affine space as

$\begin{eqnarray}&&{\mathsf{S}}:= \{X\in {\mathsf{Herm}}({ \mathcal X })\,| \,\phi (X)=B\}.\end{eqnarray} \tag{ 26 }$

Simply, ${\mathsf{S}}$ is the set of operators that satisfy the equality constraint of the primal problem. The dual affine space is given by

$\begin{eqnarray}&&\bar{{\mathsf{S}}}=\{{\rm{\Gamma }}\in {\mathsf{Herm}}({ \mathcal X })\,| \,\langle {\rm{\Gamma }},X\rangle =1,\forall X\in {\mathsf{S}}\},\end{eqnarray} \tag{ 27 }$

having used the identification of ${\mathsf{Herm}}({ \mathcal X })$ with its dual space. With this notation, we have

Theorem 1. Let $(\phi ,A,B)$ be a SDP. The optimal solution of the primal problem is upper bounded as

$\begin{eqnarray}&&{\omega }_{\mathrm{primal}}\leqslant \underset{{\rm{\Gamma }}\in \bar{{\mathsf{S}}}}{\inf }\,\min \{\lambda \in {\mathbb{R}}\,| \,\lambda {\rm{\Gamma }}\geqslant A\},\end{eqnarray} \tag{ 28 }$

where $\bar{{\mathsf{S}}}$ is the dual affine space defined in equation (27). If ${\mathsf{S}}$ contains a positive operator and $\bar{{\mathsf{S}}}$ contains a strictly positive operator, then equation (28) holds with the equality sign. If, in addition, the operator A is positive, then one has the expression

$\begin{eqnarray}&&{\omega }_{\mathrm{primal}}={2}^{{D}_{\max }(A\parallel {\bar{{\mathsf{S}}}}_{+})},\end{eqnarray} \tag{ 29 }$

where ${\bar{{\mathsf{S}}}}_{+}$ is the dual convex set ${\bar{{\mathsf{S}}}}_{+}:= \{{\rm{\Gamma }}\in \bar{{\mathsf{S}}}\,| \,{\rm{\Gamma }}\geqslant 0\}$ .

The proof can be found in appendix A.

We call the quantity ${D}_{\max }(A\,\parallel \,{\bar{{\mathsf{S}}}}_{+})$ the max divergence from normalization. This quantity measures how much the operator A deviates from the set of positive functionals that are normalized on every element of the primal set.
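The upper bound (28) can be probed numerically in a case that we add for illustration: take $\phi ={\Tr }_{\mathrm{out}}$ and $B={I}_{\mathrm{in}}$ , so that the positive elements of ${\mathsf{S}}$ are Choi operators of quantum channels and every operator ${\rm{\Gamma }}={I}_{\mathrm{out}}\otimes \sigma$ with $\Tr [\sigma ]=1$ is normalized on ${\mathsf{S}}$ . The sketch below samples one random channel, one random positive operator $A$ , and one random full-rank σ, and checks that $\langle A,C\rangle$ does not exceed $\min \{\lambda \,| \,\lambda {\rm{\Gamma }}\geqslant A\}$ . The sampling procedure and all names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, n_kraus = 2, 2, 3

def random_choi():
    """Choi operator on out (x) in of a random channel, built from an isometry."""
    shape = (d_out * n_kraus, d_in)
    G = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    V, _ = np.linalg.qr(G)                      # V^dagger V = I_in
    C = np.zeros((d_out * d_in, d_out * d_in), dtype=complex)
    for k in range(n_kraus):
        K = V[k * d_out:(k + 1) * d_out, :]     # Kraus operator, shape (d_out, d_in)
        vec = K.reshape(-1)                     # components K[o, i] at index o*d_in + i
        C += np.outer(vec, vec.conj())
    return C

def random_state(d):
    """Random full-rank density matrix (full rank with probability one)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    S = G @ G.conj().T
    return S / np.trace(S).real

def min_lambda(A, Gamma):
    """Smallest lam with lam*Gamma >= A, assuming Gamma is strictly positive."""
    w, v = np.linalg.eigh(Gamma)
    R = (v * w ** -0.5) @ v.conj().T
    M = R @ A @ R
    return np.linalg.eigvalsh((M + M.conj().T) / 2).max()

A = random_state(d_out * d_in)                  # positive operator playing the role of A
C = random_choi()
Gamma = np.kron(np.eye(d_out), random_state(d_in))

normalization = np.trace(Gamma @ C).real        # equals 1 for every channel
score = np.trace(A @ C).real
bound = min_lambda(A, Gamma)
print(normalization, score, bound)
```

A single sample only illustrates the inequality; attaining the infimum in equation (28) requires optimizing over σ, which this sketch does not attempt.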

The connection between SDP and the max relative entropy has previously appeared in the special case where the task is to optimize quantum channels [51, 87]. A related result was obtained by Jenčová in the framework of base norms [88]. In the next sections we will elaborate on the physical meaning of theorem 1, which will be applied to the optimization of quantum networks, both with definite and indefinite causal structure. Before specializing ourselves to quantum networks, however, it is worth emphasizing a simple connection between the max relative entropy arising in generic SDPs and the max relative entropy of quantum states.

Proposition 5. Let ${C}_{0}$ and ${C}_{1}$ be two elements of the convex set

$\begin{eqnarray*}&&{{\mathsf{S}}}_{+}=\{X\in {\mathsf{Herm}}({ \mathcal X })\,| \,\phi (X)=B,X\geqslant 0\}.\end{eqnarray*}$

Then, one has the bound

$\begin{eqnarray*}&&{D}_{\max }(\sqrt{{\rm{\Gamma }}}{C}_{0}\sqrt{{\rm{\Gamma }}}\,\parallel \,\sqrt{{\rm{\Gamma }}}{C}_{1}\sqrt{{\rm{\Gamma }}})\leqslant {D}_{\max }({C}_{0}\,\parallel \,{C}_{1}),\qquad \forall {\rm{\Gamma }}\in {\bar{{\mathsf{S}}}}_{+}.\end{eqnarray*}$

The bound holds with the equality if the dual convex set ${\bar{{\mathsf{S}}}}_{+}$ contains a full-rank operator.

The proof can be found in appendix C. Note that, by construction, the operators $\sqrt{{\rm{\Gamma }}}{C}_{i}\sqrt{{\rm{\Gamma }}}$ , $i=0,1$ , are density matrices: indeed, they are positive and $\Tr [\sqrt{{\rm{\Gamma }}}{C}_{i}\sqrt{{\rm{\Gamma }}}]=\Tr [{\rm{\Gamma }}{C}_{i}]=1$ , since, by definition, Γ is a positive functional normalized on the primal set ${\mathsf{S}}$ . Proposition 5 will be used to show that the max relative entropy of two quantum networks is equal to the maximum of the max relative entropy between the output states generated by the networks.
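A two-qubit sanity check of proposition 5 can be sketched as follows, under assumptions that are ours: the primal set consists of Choi operators of qubit channels, ${C}_{0}$ is the Choi operator of the identity channel, ${C}_{1}$ is that of the completely depolarizing channel, and ${\rm{\Gamma }}=I\otimes \sigma$ . The helper names are hypothetical, and ${D}_{\max }$ is evaluated on the support of its second argument. A rank-deficient σ gives a strict inequality, whereas a full-rank σ attains the bound.

```python
import numpy as np

def psd_power(M, p, tol=1e-12):
    """M**p restricted to the support of a positive semidefinite matrix M."""
    w, v = np.linalg.eigh(M)
    wp = np.array([x ** p if x > tol else 0.0 for x in w])
    return (v * wp) @ v.conj().T

def d_max(A, B):
    """log2 of the smallest lam with A <= lam*B; assumes supp(A) lies in supp(B)."""
    R = psd_power(B, -0.5)
    M = R @ A @ R
    return np.log2(np.linalg.eigvalsh((M + M.conj().T) / 2).max())

def sandwich(C, sigma):
    """sqrt(Gamma) C sqrt(Gamma) for Gamma = I_out (x) sigma."""
    root = np.kron(np.eye(2), psd_power(sigma, 0.5))
    return root @ C @ root

phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
C0 = 2 * np.outer(phi, phi)          # identity channel
C1 = np.eye(4) / 2                   # completely depolarizing channel

network_value = d_max(C0, C1)                               # close to 2
pure_input = d_max(sandwich(C0, np.diag([1.0, 0.0])),
                   sandwich(C1, np.diag([1.0, 0.0])))       # close to 1
mixed_input = d_max(sandwich(C0, np.eye(2) / 2),
                    sandwich(C1, np.eye(2) / 2))            # close to 2
print(network_value, pure_input, mixed_input)
```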

4. Optimizing quantum causal networks

Here we consider the scenario where a network of quantum devices, arranged in a definite causal order, is required to perform a desired task, such as implementing a distributed algorithm. What is the maximum performance that the network can attain? In this section we answer the question, measuring the performance through the score obtained in a suitable test (depending on the task at hand) and providing a closed-form expression for the maximum score.

4.1. The dual networks

Following section 2.5, the mathematical description of the test is provided by a performance operator Ω, acting on the Hilbert spaces of the input and output systems of the tested network. The maximum performance achieved by an arbitrary causal network is determined by the following

Theorem 2. Let ${\rm{\Omega }}$ be an operator on ${\bigotimes }_{j=1}^{N}(\,{{ \mathcal H }}_{j}^{\mathrm{out}}\otimes {{ \mathcal H }}_{j}^{\mathrm{in}})$ and let ${\omega }_{\max }$ be the maximum of $\langle {\rm{\Omega }},C\rangle$ over all operators C representing quantum networks of the form

Then, ${\omega }_{\max }$ is given by

$\begin{eqnarray}&&{\omega }_{\max }=\underset{{\rm{\Gamma }}\in {\mathsf{DualComb}}}{\min }\,\min \{\lambda \in {\mathbb{R}}\,| \,\lambda {\rm{\Gamma }}\geqslant {\rm{\Omega }}\},\end{eqnarray} \tag{ 31 }$

where ${\mathsf{DualComb}}$ denotes the set of dual combs, that is, positive operators Γ representing networks of the form

where $\sigma$ is a quantum state, $({{ \mathcal E }}_{1},{{ \mathcal E }}_{2},\ldots ,{{ \mathcal E }}_{N-1})$ is a sequence of quantum channels, and ${\Tr }_{{A}_{N}^{\mathrm{out}}}$ represents the trace over the last system. Explicitly, ${\mathsf{DualComb}}$ is the set of all positive operators ${\rm{\Gamma }}$ satisfying the linear constraint

$\begin{eqnarray}{\rm{\Gamma }} & = & {I}_{{A}_{N}^{\mathrm{out}}}\otimes {{\rm{\Gamma }}}^{(N)}\\ {\Tr }_{{A}_{n}^{\mathrm{in}}}[{{\rm{\Gamma }}}^{(n)}] & = & {I}_{{A}_{n-1}^{\mathrm{out}}}\otimes {{\rm{\Gamma }}}^{(n-1)},\quad n=2,\ldots ,N\\ {\Tr }_{{A}_{1}^{\mathrm{in}}}[{{\rm{\Gamma }}}^{(1)}] & = & 1,\end{eqnarray} \tag{ 33 }$

for suitable positive operators ${{\rm{\Gamma }}}^{(n)}$ acting on ${{ \mathcal H }}_{n}^{\mathrm{in}}\otimes [{\bigotimes }_{j=1}^{n-1}(\,{{ \mathcal H }}_{j}^{\mathrm{out}}\otimes {{ \mathcal H }}_{j}^{\mathrm{in}})]$ . When ${\rm{\Omega }}$ is positive, the maximum performance can be expressed as

$\begin{eqnarray}&&{\omega }_{\max }={2}^{{D}_{\max }({\rm{\Omega }}\parallel {\mathsf{DualComb}})}.\end{eqnarray} \tag{ 34 }$

The proof can be found in appendix B.

Theorem 2 has an intuitive interpretation. The dual networks (32) and the primal networks (30) 'deterministically complement each other': when two such networks are connected, one obtains the closed circuit

which yields no information about the primal network and makes any such information inaccessible to further tests. Hence, the dual networks represent the non-informative tests. The max relative entropy quantifies how much the test with performance operator Ω deviates from the set of non-informative tests.

4.2. The case of binary testers

Consider a binary test, described by the tester $\{{T}_{\mathrm{yes}},{T}_{\mathrm{no}}\}$ and assume that the test is passed if and only if the testing network yields the outcome ' $\mathrm{yes}$ '. Binary testers have applications in the theory of quantum interactive proof systems [46], where they can be used to compute the probability that the verifier accepts the token provided by the prover through a sequence of operations. In this scenario, the performance operator is given by ${\rm{\Omega }}={T}_{\mathrm{yes}}$ and the probability that the prover passes the test, optimized over all possible quantum strategies, is

$\begin{eqnarray}&&{p}_{\max }=\,{(\underset{{\rm{\Gamma }}\in {\mathsf{DualComb}}}{\max }\max \{w\,| \,w\,{T}_{\mathrm{yes}}\leqslant {\rm{\Gamma }}\})}^{-1},\end{eqnarray} \tag{ 36 }$

having used equation (31) with λ replaced by its inverse $w=1/\lambda$ . In words, the problem is to find the maximum weight for which one can squeeze the tester operator ${T}_{\mathrm{yes}}$ under some dual comb Γ.

This maximization has an intuitive interpretation:

Corollary 1. The maximum probability that a quantum causal network passes the test defined by the operator ${T}_{\mathrm{yes}}$ is equal to the inverse of the maximum weight w for which there exists a two-outcome tester $\{{T}_{\mathrm{yes}}^{\prime },{T}_{\mathrm{no}}^{\prime }\}$ satisfying ${T}_{\mathrm{yes}}^{\prime }=w\,{T}_{\mathrm{yes}}$ .

Proof. Suppose that the relation $w\,{T}_{\mathrm{yes}}\leqslant {\rm{\Gamma }}$ holds for some weight w and some dual comb Γ. Then, define ${T}_{\mathrm{yes}}^{\prime }:= w\,{T}_{\mathrm{yes}}$ and ${T}_{\mathrm{no}}^{\prime }:= {\rm{\Gamma }}-{T}_{\mathrm{yes}}^{\prime }$ . By construction, the operators $\{{T}_{\mathrm{yes}}^{\prime },{T}_{\mathrm{no}}^{\prime }\}$ form a tester: they are positive and their sum satisfies equation (17).□

In other words, the dual problem amounts to finding the binary tester $\{{T}_{\mathrm{yes}}^{* },{T}_{\mathrm{no}}^{* }\}$ that assigns the maximum possible probability to the outcome ' $\mathrm{yes}$ ', subject to the condition that ${T}_{\mathrm{yes}}^{* }$ is proportional to ${T}_{\mathrm{yes}}$ . The content of the duality is that the maximum is attained when there exists a primal network that triggers deterministically the outcome ' $\mathrm{yes}$ ':

Corollary 2. Let $\{{T}_{\mathrm{yes}}^{* },{T}_{\mathrm{no}}^{* }\}$ be the optimal tester for the dual problem and let ${C}^{* }$ be the optimal quantum comb for the primal problem. Then, one has

$\begin{eqnarray*}&&\langle {T}_{\mathrm{yes}}^{* },{C}^{* }\rangle =1.\end{eqnarray*}$

Proof. Let ${w}^{* }$ be the optimal weight in the dual problem. Then, one has ${T}_{\mathrm{yes}}^{* }={w}^{* }\,{T}_{\mathrm{yes}}$ and $\langle {T}_{\mathrm{yes}},{C}^{* }\rangle =1/{w}^{* }$ . Combining these two equations, one gets $\langle {T}_{\mathrm{yes}}^{* },{C}^{* }\rangle ={w}^{* }\,\langle {T}_{\mathrm{yes}},{C}^{* }\rangle =1$ . □

5. The conditional min-entropy of quantum causal networks

Theorem 2 allows us to extend the notion of conditional min-entropy [50] from quantum states to quantum causal networks. Let us first review the basic properties of the conditional min-entropy of quantum states: For a quantum state $\rho \in {\mathsf{St}}({AB})$ , the conditional min-entropy of system A, conditional on system B, is defined as [50]

$\begin{eqnarray}&&{H}_{\min }{(A| B)}_{\rho }:= -\mathrm{log}[\underset{\gamma \in {\mathsf{St}}(B)}{\min }\min \{\lambda \in {\mathbb{R}}\,| \,\lambda ({I}_{A}\otimes {\gamma }_{B})\geqslant {\rho }_{{AB}}\}].\end{eqnarray} \tag{ 37 }$

König et al [51] clarified the operational meaning of ${H}_{\min }{(A| B)}_{\rho }$ in terms of the following task: given the state ${\rho }_{{AB}}$ , find the quantum channel ${ \mathcal C }$ that produces the best approximation of the maximally entangled state $| {\rm{\Phi }}{\rangle }_{{AA}^{\prime} }:= {\sum }_{n=1}^{{d}_{A}}\,| n\rangle | n\rangle /\sqrt{{d}_{A}}$ , by acting locally on system B. Here the quality of the approximation is measured by the fidelity, namely the probability that the output state passes a binary test with POVM $\{{P}_{\mathrm{yes}},{P}_{\mathrm{no}}\}$ , defined by ${P}_{\mathrm{yes}}:= | {\rm{\Phi }}\rangle \langle {\rm{\Phi }}|$ . Overall, we can jointly regard the preparation of the state ρ and the measurement of the binary POVM $\{{P}_{\mathrm{yes}},{P}_{\mathrm{no}}\}$ as a test performed on the channel ${ \mathcal C }$ . Diagrammatically, the successful instance of the test is represented by the network

whose Choi operator is given by

$\begin{eqnarray*}{T}_{\mathrm{yes}} & := & \rho \ast {P}_{\mathrm{yes}}^{T}\\ & = & \rho /{d}_{A},\end{eqnarray*}$

(with a slight abuse of notation, in the second equality we regard ρ as an operator on $A^{\prime} B$ , instead of AB). Hence, the probability that the channel passes the test is

$\begin{eqnarray*}p & = & {T}_{\mathrm{yes}}\ast C\\ & = & \displaystyle \frac{\Tr [\rho \,{C}^{T}]}{{d}_{A}},\end{eqnarray*}$

where C is the Choi operator of ${ \mathcal C }$ . König, Renner, and Schaffner showed that the maximum probability over all possible channels is

$\begin{eqnarray}&&{p}_{\max }=\,\displaystyle \frac{{2}^{-{H}_{\min }{(A| B)}_{\rho }}}{{d}_{A}}.\end{eqnarray} \tag{ 39 }$
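Equation (39) can be checked directly for a maximally entangled state of two qubits, a standard example that we add for illustration. Choosing $\gamma =I/{d}_{A}$ in equation (37) gives a lower bound on ${H}_{\min }{(A| B)}_{\rho }$ , while the identity channel, whose Choi operator is ${d}_{A}\,| {\rm{\Phi }}\rangle \langle {\rm{\Phi }}|$ , gives a lower bound on ${p}_{\max }$ and hence an upper bound on ${H}_{\min }{(A| B)}_{\rho }$ . The two bounds coincide at $-\mathrm{log}\,{d}_{A}$ . Variable names in the sketch are hypothetical.

```python
import numpy as np

d = 2
phi = np.zeros(d * d)
for n in range(d):
    phi[n * d + n] = 1 / np.sqrt(d)     # |Phi> = sum_n |n>|n> / sqrt(d)
rho = np.outer(phi, phi)                # maximally entangled state rho_AB

# Dual side: gamma = I/d makes I_A (x) gamma proportional to the identity,
# so the smallest lam with lam * (I (x) gamma) >= rho is d * lambda_max(rho).
lam = d * np.linalg.eigvalsh(rho).max()
h_min_lower = -np.log2(lam)

# Primal side: the identity channel has Choi operator C = d * |Phi><Phi|.
C = d * rho
p = np.trace(rho @ C.T).real / d        # p = Tr[rho C^T] / d_A
h_min_upper = -np.log2(p * d)           # from p <= p_max = 2^(-H_min) / d_A

print(h_min_lower, h_min_upper, p)
```

In this example the test is passed with probability close to one, which matches the negative value of the conditional min-entropy expected for a maximally entangled state.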

We now extend the notion of conditional min-entropy from states to networks with a definite causal structure. This can be done in two slightly different ways, illustrated in the following subsections.

5.1. The conditional min-entropy of a quantum causal network

The first way to generalize the conditional min-entropy from states is to regard ${H}_{\min }{(A| B)}_{\rho }$ as a measure of the correlations that can be extracted from the state ${\rho }_{{AB}}$ by acting on system B alone. A natural generalization to the network scenario arises if we consider a quantum network of the form

and ask how much correlation can be generated by interacting with the network in the first $N-1$ time steps. To generate the correlations, we can connect the network (40) with a second network that processes all input/output systems before ${B}_{N}^{\mathrm{out}}$ . Graphically, the second network can be described as

where ${B}_{N}^{\mathrm{out}^{\prime} }$ is a quantum system of the same dimension as ${B}_{N}^{\mathrm{out}}$ . When the two networks are connected, they generate the bipartite state

A measure of the correlations generated by the interaction of the two networks is then provided by the fidelity between the state (42) and the maximally entangled state.

Explicitly, the fidelity is given by

$\begin{eqnarray}F\,: & = & \langle {\rm{\Phi }}| (\sigma \ast {D}_{1}\ast {E}_{1}\ast \cdots \ast {D}_{N-1}\ast {E}_{N-1}\ast {D}_{N})\,| {\rm{\Phi }}\rangle \\ & = & \displaystyle \frac{\Tr [D\,{E}^{T}]}{{d}_{{B}_{N}^{\mathrm{out}}}},\end{eqnarray} \tag{ 43 }$

with

$\begin{eqnarray*}&&D:= {D}_{1}\ast \cdots \ast {D}_{N}\qquad \mathrm{and}\qquad E:= \sigma \ast {E}_{1}\ast \cdots \ast {E}_{N-1}\end{eqnarray*}$

(with a little abuse of notation, in the second equality we regard E as an operator on ${{ \mathcal H }}_{{B}_{1}^{\mathrm{in}}}\otimes {{ \mathcal H }}_{{B}_{1}^{\mathrm{out}}}\otimes \cdots \,\otimes {{ \mathcal H }}_{{B}_{N}^{\mathrm{in}}}\otimes {{ \mathcal H }}_{{B}_{N}^{\mathrm{out}}}$ instead of ${{ \mathcal H }}_{{B}_{1}^{\mathrm{in}}}\otimes {{ \mathcal H }}_{{B}_{1}^{\mathrm{out}}}\otimes \cdots \,\otimes {{ \mathcal H }}_{{B}_{N}^{\mathrm{in}}}\otimes {{ \mathcal H }}_{{B}_{N}^{\mathrm{out}^{\prime} }}$ ).

The maximum of the fidelity over all networks of the form (41) can be computed via theorem 2, which yields the expression

$\begin{eqnarray}&&{F}_{\max }=\displaystyle \frac{{\min }_{{{\rm{\Gamma }}}_{{t}_{1}...{t}_{N-1}}}\min \{\lambda \in {\mathbb{R}}\,| \,\lambda ({I}_{{t}_{N}}\otimes {{\rm{\Gamma }}}_{{t}_{1}...{t}_{N-1}})\geqslant D\}}{{d}_{{B}_{N}^{\mathrm{out}}}},\end{eqnarray} \tag{ 44 }$

where ${{\rm{\Gamma }}}_{{t}_{1}...{t}_{N-1}}$ is a generic element of ${\mathsf{Comb}}({B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}},\,...,\,{B}_{N-1}^{\mathrm{in}}\to {B}_{N-1}^{\mathrm{out}})$ and ${I}_{{t}_{N}}:= {I}_{{B}_{N}^{\mathrm{out}}}\otimes {I}_{{B}_{N}^{\mathrm{in}}}$ .

Equation (44) motivates the following.

Definition 4. Let $D\in {\mathsf{Comb}}({B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}},...,{B}_{N}^{\mathrm{in}}\to {B}_{N}^{\mathrm{out}})$ be a quantum comb and let ${t}_{j}:= {B}_{j}^{\mathrm{in}}\to {B}_{j}^{\mathrm{out}}$ be the type corresponding to the jth time step. The network min-entropy of the Nth time step, conditionally on the first $N-1$ time steps is the quantity

$\begin{eqnarray}&&{H}_{\min }{({t}_{N}| {t}_{1}\cdots {t}_{N-1})}_{D}\\ &&\quad := -\mathrm{log}[\underset{{{\rm{\Gamma }}}_{{t}_{1}...{t}_{N-1}}}{\min }\min \{\lambda \in {\mathbb{R}}\,| \,\lambda ({I}_{{t}_{N}}\otimes {{\rm{\Gamma }}}_{{t}_{1}...{t}_{N-1}})\geqslant D\}],\qquad \qquad \qquad \end{eqnarray} \tag{ 45 }$

where the first minimum is over the elements of ${\mathsf{Comb}}({B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}},\,...,\,{B}_{N-1}^{\mathrm{in}}\to {B}_{N-1}^{\mathrm{out}})$ and ${I}_{{t}_{N}}:= {I}_{{B}_{N}^{\mathrm{out}}}\otimes {I}_{{B}_{N}^{\mathrm{in}}}$ .

The above definition is a compelling generalization of the conditional min-entropy for states. First of all, it comes with a natural operational interpretation, as the maximum amount of correlations between the last output of the network and all the systems involved in the previous history. Moreover, the conditional min-entropy of quantum networks is consistent with the conditional min-entropy of quantum states: Concretely, one can interpret the conditional min-entropy (45) as the maximum conditional min-entropy of the output state of the network, conditionally on an external reference system generated through the intermediate time steps. This interpretation is based on the following.

Proposition 6. For a causal network with Choi operator $D\in {\mathsf{Comb}}({B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}},...,{B}_{N}^{\mathrm{in}}\to {B}_{N}^{\mathrm{out}})$ , the min-entropy ${H}_{\min }{({t}_{N}| {t}_{1}\cdots {t}_{N-1})}_{D}$ is equal to the min-entropy of the output state $\rho \in {\mathsf{St}}({{ \mathcal H }}_{{B}_{N}^{\mathrm{out}}}\otimes {{ \mathcal H }}_{{B}_{N}^{\mathrm{out}^{\prime} }})$ in equation (42) maximized over the input state σ and over the sequence of intermediate operations ${{ \mathcal E }}_{1},\ldots ,{{ \mathcal E }}_{N-1}$ .

The proof is given in appendix D. We expect that the network min-entropy defined in equation (45) will play a role in the study of non-Markovian quantum evolutions, along the lines of the entropic characterization of Markovianity provided in [89, 90]. Intuitively, the idea is that one can evaluate how the correlations build up from one step to the next and use this information to infer properties of the internal memory used by the network.

5.2. The conditional min-entropy of a test

An alternative way to extend the notion of conditional min-entropy is to regard ${H}_{\min }{(A| B)}_{\rho }$ as a quantity associated to a test, specifically the test depicted in equation (38). From this point of view, it is natural to extend the definition to tests consisting of multiple time steps, as follows.

Definition 5. Let ${T}_{\mathrm{yes}}$ be a positive operator associated to a test of the form

The conditional min-entropy of the output system ${A}_{N}^{\mathrm{out}}$ , conditionally on all the previous systems is

$\begin{eqnarray}&&{H}_{\min }{({A}_{N}^{\mathrm{out}}| {A}_{1}^{\mathrm{in}}{A}_{1}^{\mathrm{out}}{A}_{2}^{\mathrm{in}}...{A}_{N}^{\mathrm{in}})}_{{T}_{\mathrm{yes}}}:= -\mathrm{log}[\underset{{{\rm{\Gamma }}}^{(N)}}{\min }\min \{\lambda \in {\mathbb{R}}\,| \,\lambda ({I}_{{A}_{N}^{\mathrm{out}}}\otimes {{\rm{\Gamma }}}^{(N)})\geqslant {T}_{\mathrm{yes}}\}],\end{eqnarray} \tag{ 47 }$

where ${{\rm{\Gamma }}}^{(N)}$ is a generic element of ${\mathsf{Comb}}(I\to {A}_{1}^{\mathrm{in}},\,{A}_{1}^{\mathrm{out}}\to {A}_{2}^{\mathrm{in}}...,\,{A}_{N-1}^{\mathrm{out}}\to {A}_{N}^{\mathrm{in}})$ , corresponding to a network of the form

The conditional min-entropy for 'states' (or, more precisely, for tests of the form (38)) can be retrieved as a special case of this definition, by setting $N=1$ , ${A}_{1}^{\mathrm{in}}=B$ , ${A}_{1}^{\mathrm{out}}=A$ , and ${T}_{\mathrm{yes}}=\rho /{d}_{A}$ . The appeal of the above definition is that it extends the definition of min-entropy to a class of probabilistic operations.

The conditional min-entropy can be interpreted operationally as the (negative logarithm of the) maximum probability that a quantum causal network passes the test ${T}_{\mathrm{yes}}$ . This interpretation follows from theorem 2, which yields the following.

Corollary 3. The maximum probability that a quantum causal network of the form

passes the test with operator ${T}_{\mathrm{yes}}$ is

$\begin{eqnarray*}&&{p}_{\max }={2}^{-{H}_{\min }{({A}_{N}^{\mathrm{out}}| {A}_{1}^{\mathrm{in}}{A}_{1}^{\mathrm{out}}{A}_{2}^{\mathrm{in}}...{A}_{N}^{\mathrm{in}})}_{{T}_{\mathrm{yes}}}}.\end{eqnarray*}$

This result secures an operational interpretation for the conditional min-entropy defined in equation (47). Quite intuitively, the conditional min-entropy of the test is a measure of how well the first $N-1$ time steps can be used to predict the outcome of the measurement performed in the last step.

6. The max relative entropy of causal networks

We conclude our study of causal networks with a result relating the max relative entropy of quantum networks to the max relative entropy of quantum states:

Proposition 7. Let ${C}^{(0)}$ and ${C}^{(1)}$ be the Choi operators of two networks

and let $E$ be the Choi operator of a network of the form

where $S$ is a generic quantum system. Then, one has

$\begin{eqnarray}&&{D}_{\max }({C}^{(0)}\,\parallel \,{C}^{(1)})=\underset{E}{\max }\,{D}_{\max }({C}^{(0)}* E\,\parallel \,{C}^{(1)}* E),\end{eqnarray} \tag{ 51 }$

where the maximum runs over all networks of the form (50), with arbitrary system S.

The proof is provided in appendix E. In words, equation (51) states that the max relative entropy between two quantum networks is equal to the max relative entropy between the output states one can generate from them. Diagrammatically, the output states are

Proposition 7 has an application to problems of hypothesis testing where the task is to distinguish between two quantum networks. Here one has access to a quantum network that is promised to have quantum comb ${C}^{(0)}$ or ${C}^{(1)}$ . In order to determine which of these two hypotheses is correct, one has to interact with the network, by sending inputs to it and processing its outputs. In the end, these operations will result in the preparation of a quantum state, as in equation (52). At this point, the problem is to distinguish between two states ${\rho }^{(0)}$ and ${\rho }^{(1)}$ corresponding to the two hypotheses. One-shot hypothesis testing of quantum states has been studied by Datta et al in [86], where they provided bounds on the type II error probability in terms of the max relative entropy. Proposition 7 then allows one to relate the max relative entropy of the output states to the max relative entropy of the networks, opening a route to adapting the results of [86] to hypothesis testing in the more general scenario of quantum networks.

7. Non-causal networks

In the previous sections we restricted our focus to causal networks. We now address the general scenario, concerning networks that are not compatible with any pre-defined causal order [33, 34, 37–42, 44]. Some of these networks arise when multiple quantum devices are connected in a way that is controlled by the state of a quantum system [33, 38]. Some other networks are not built from individual devices [34, 41] but may possibly arise in exotic quantum gravity scenarios. These generalized quantum networks are characterized by the way in which they interact with external quantum devices.

7.1. A bipartite example

The characterization of the non-causal networks is not as simple as in the case of causally ordered networks. We first illustrate the idea in a simple example, inspired by the work of Oreshkov et al [41]. Imagine two laboratories, A and B, where two parties, Alice and Bob perform local experiments. In each laboratory, ordinary quantum theory holds and, in particular, one can describe the time evolution by quantum channels. Specifically, let ${ \mathcal A }$ and ${ \mathcal B }$ be the quantum channels describing the evolution of the systems in laboratories A and B, respectively. Now, one can model the interactions between one laboratory and the other by a generalized quantum network, which describes the background structure of spacetime.

Concretely, suppose that, at some earlier time, system ${A}_{1}^{\mathrm{in}}$ in the first laboratory has been prepared jointly with system ${B}_{1}^{\mathrm{in}}$ in the second laboratory, and that, at a later time, system ${A}_{1}^{\mathrm{out}}$ and system ${B}_{1}^{\mathrm{out}}$ are discarded. Indulging in a bit of science fiction, one could imagine a scenario where systems ${A}_{1}^{\mathrm{in}}$ and ${B}_{1}^{\mathrm{in}}$ emerge from a wormhole at time t₀ and systems ${A}_{1}^{\mathrm{out}}$ and ${B}_{1}^{\mathrm{out}}$ enter the same wormhole at time t₁. Between times t₀ and t₁ the systems ${A}_{1}^{\mathrm{in}}$ and ${B}_{1}^{\mathrm{in}}$ can interact with the other systems in Alice's and Bob's laboratories, here denoted as ${A}_{2}^{\mathrm{in}},{A}_{2}^{\mathrm{out}}$ and ${B}_{2}^{\mathrm{in}},{B}_{2}^{\mathrm{out}}$ , respectively. The interaction is controlled locally by Alice and Bob, who implement the channels ${ \mathcal A }$ and ${ \mathcal B }$ , as illustrated in figure 4. The connection of Alice's and Bob's laboratories through the background spacetime structure can be described as a map

$\begin{eqnarray}&&{ \mathcal S }:{ \mathcal A }\otimes { \mathcal B }\mapsto { \mathcal S }({ \mathcal A }\otimes { \mathcal B }),\end{eqnarray} \tag{ 53 }$

which transforms the quantum channels ${ \mathcal A }$ and ${ \mathcal B }$ into a new quantum channel ${ \mathcal S }({ \mathcal A }\otimes { \mathcal B })$ . Maps that transform channels into channels are known as quantum supermaps [18, 53]. The basic requirements for quantum supermaps are linearity, complete positivity, and normalization. In this setting, linearity means that one has

$\begin{eqnarray}&&{ \mathcal S }\left(\displaystyle \sum _{i}\,{p}_{i}\,{{ \mathcal A }}_{i}\otimes {{ \mathcal B }}_{i}\right)=\displaystyle \sum _{i}\,{p}_{i}\,{ \mathcal S }({{ \mathcal A }}_{i}\otimes {{ \mathcal B }}_{i}),\end{eqnarray} \tag{ 54 }$

for every choice of coefficients $\{{p}_{i}\}$ . The standard motivation for linearity comes from the requirement that convex combinations of input channels (generated by Alice and Bob by sharing random bits) be mapped into convex combinations of the corresponding outputs.

**Figure 4.** The quantum channels ${ \mathcal A }$ and ${ \mathcal B }$ in Alice's and Bob's laboratories interact through a quantum network ${ \mathcal C }$ , describing the interactions mediated by the background spacetime. Here, ${ \mathcal A }$ ( ${ \mathcal B }$ ) is a bipartite channel transforming the input systems ${A}_{1}^{\mathrm{in}}{A}_{2}^{\mathrm{in}}$ ( ${B}_{1}^{\mathrm{in}}{B}_{2}^{\mathrm{in}}$ ) into the output systems ${A}_{1}^{\mathrm{out}}{A}_{2}^{\mathrm{out}}$ ( ${B}_{1}^{\mathrm{out}}{B}_{2}^{\mathrm{out}}$ ). The connection between the channels takes place only through the systems ${A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}}$ and ${B}_{1}^{\mathrm{out}}$ , while systems ${A}_{2}^{\mathrm{in}},{A}_{2}^{\mathrm{out}},{B}_{2}^{\mathrm{in}}$ and ${B}_{2}^{\mathrm{out}}$ do not interact directly.

**Figure 5.** Schematic of a test for probing two different hypotheses of quantum spacetime. The two hypotheses are described by the (possibly non-causal) network (in blue) connecting systems ${A}^{\mathrm{in}}$ and ${A}_{1}^{\mathrm{out}}$ in Alice's laboratory with systems ${B}^{\mathrm{in}}$ and ${B}^{\mathrm{out}}$ in Bob's laboratory. The test consists in applying a quantum channel ${ \mathcal E }$ (in orange), acting on systems ${A}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}^{\mathrm{in}}$ , and ${B}^{\mathrm{out}}$ , plus an additional system S. The channel ${ \mathcal E }$ has the property that, once system S is discarded, the evolution of the remaining systems is no-signaling. We stress that the above model represents the most general way—*in principle*—to discriminate between two hypotheses of causal structure. However, depending on the situation, there may be constraints on the channel ${ \mathcal E }$ , such as the ability to implement ${ \mathcal E }$ with local interactions, or the presence of conservation laws that further limit the set of available channels.

Regarding complete positivity, it can be motivated by the local form of the interactions. Since the interaction between Alice's and Bob's laboratory takes place only through systems A₁ and B₁, it is natural to assume that the supermap ${ \mathcal S }$ acts non-trivially only on these systems, as

$\begin{eqnarray}&&{ \mathcal S }({ \mathcal A }\otimes { \mathcal B })=({{ \mathcal I }}_{{A}_{2}\to {A}_{2}}\otimes { \mathcal C }\otimes {{ \mathcal I }}_{{B}_{2}\to {B}_{2}})({ \mathcal A }\otimes { \mathcal B }),\end{eqnarray} \tag{ 55 }$

where ${{ \mathcal I }}_{{A}_{2}\to {A}_{2}}$ ( ${{ \mathcal I }}_{{B}_{2}\to {B}_{2}}$ ) is the identity supermap, acting trivially on the channels with input ${A}_{2}^{\mathrm{in}}$ ( ${B}_{2}^{\mathrm{in}}$ ) and output ${A}_{2}^{\mathrm{out}}$ ( ${B}_{2}^{\mathrm{out}}$ ), and ${ \mathcal C }$ is a supermap that annihilates channels with input ${A}_{1}^{\mathrm{in}}{B}_{1}^{\mathrm{in}}$ and output ${A}_{1}^{\mathrm{out}}{B}_{1}^{\mathrm{out}}$ . Physically, the map ${ \mathcal C }$ represents the piece of spacetime connecting ${A}_{1}^{\mathrm{in}}$ and ${B}_{1}^{\mathrm{in}}$ with ${A}_{1}^{\mathrm{out}}$ and ${B}_{1}^{\mathrm{out}}$ .

7.2. Choi operator formulation

Since all the maps ${ \mathcal A },{ \mathcal B }$ , and ${ \mathcal C }$ are completely positive, one can represent them with Choi operators A, B, and C, respectively. In terms of Choi operators, equation (55) can be expressed as

$\begin{eqnarray}&&C^{\prime} ={\Tr }_{{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[({I}_{{A}_{2}^{\mathrm{out}}}\otimes {I}_{{A}_{2}^{\mathrm{in}}}\otimes C\otimes {I}_{{B}_{2}^{\mathrm{out}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}})({A}^{T}\otimes {B}^{T})],\end{eqnarray} \tag{ 56 }$

where $C^{\prime}$ is the Choi operator of the channel ${ \mathcal S }({ \mathcal A }\otimes { \mathcal B })$ . Oreshkov et al refer to the Choi operator C as a process matrix [41]. For the supermaps that can be implemented by connecting quantum devices in a fixed causal structure, the Choi operators C are the same as the quantum combs considered in the previous sections.

Here the operator C acts on the tensor product Hilbert space ${{ \mathcal H }}_{{A}_{1}}^{\mathrm{in}}\otimes {{ \mathcal H }}_{{A}_{1}}^{\mathrm{out}}\otimes {{ \mathcal H }}_{{B}_{1}}^{\mathrm{in}}\otimes {{ \mathcal H }}_{{B}_{1}}^{\mathrm{out}}$ . In order to be the Choi operator of a valid quantum network, the operator C must be positive semidefinite and satisfy a suitable normalization condition—specifically, C should satisfy the condition

$\begin{eqnarray}&&{\Tr }_{{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[C(\tilde{A}\otimes \tilde{B})]=1\end{eqnarray} \tag{ 57 }$

for every pair of operators $\tilde{A}$ and $\tilde{B}$ satisfying the conditions

$\begin{eqnarray}&&{\Tr }_{{A}_{1}^{\mathrm{out}}}[\tilde{A}]={I}_{{A}_{1}^{\mathrm{in}}}\qquad \mathrm{and}\qquad {\Tr }_{{B}_{1}^{\mathrm{out}}}[\tilde{B}]={I}_{{B}_{1}^{\mathrm{in}}}.\end{eqnarray} \tag{ 58 }$

(See appendix F for the derivation). Physically, this means that the non-causal network ${ \mathcal C }$ deterministically annihilates every pair of local channels $\tilde{A}$ and $\tilde{B}$ , acting on systems ${A}_{1}^{\mathrm{in}}$ , ${A}_{1}^{\mathrm{out}}$ and ${B}_{1}^{\mathrm{in}}$ , ${B}_{1}^{\mathrm{out}}$ , respectively.

Equivalently, the valid networks can be characterized as follows:

Proposition 8. An operator $C$ is the Choi operator of a non-causal network as in figure 4 if and only if $C$ is positive and $\Tr [{CD}]=1$ for every operator $D$ satisfying the conditions

$\begin{eqnarray}&&{\Tr }_{{A}_{1}^{\mathrm{out}}}[D]={I}_{{A}_{1}^{\mathrm{in}}}\otimes \tilde{B},\qquad {\Tr }_{{B}_{1}^{\mathrm{out}}}[\tilde{B}]={I}_{{B}_{1}^{\mathrm{in}}}\end{eqnarray} \tag{ 59 }$

and

$\begin{eqnarray}&&{\Tr }_{{B}_{1}^{\mathrm{out}}}[D]={I}_{{B}_{1}^{\mathrm{in}}}\otimes \tilde{A},\qquad {\Tr }_{{A}_{1}^{\mathrm{out}}}[\tilde{A}]={I}_{{A}_{1}^{\mathrm{in}}},\end{eqnarray} \tag{ 60 }$

with suitable operators $\tilde{A}$ and $\tilde{B}$ .

For the proof, see theorem 2 of [34]. The operator D represents the Choi operator of a no-signaling channel [91–93], that is, a channel that prevents the transmission of information from Alice to Bob and from Bob to Alice. The intuitive idea is that whenever a network can be connected with two local channels, it can also be connected with a no-signaling channel.

In the following we will denote by ${\mathsf{NoSig}}({A}_{1}^{\mathrm{in}}\to {A}_{1}^{\mathrm{out}}\,| \,{B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}})$ the set of positive operators satisfying the no-signaling conditions (59) and (60). With this notation, proposition 8 can be reformulated as

Corollary 4. An operator C is the Choi operator of a non-causal network as in figure 4 if and only if

$\begin{eqnarray}&&C\geqslant 0\end{eqnarray} \tag{ 61 }$

$\begin{eqnarray}&&C\in \bar{{\mathsf{NoSig}}({A}_{1}^{\mathrm{in}}\to {A}_{1}^{\mathrm{out}}| {B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}})},\end{eqnarray} \tag{ 62 }$

where $\bar{{\mathsf{NoSig}}({A}_{1}^{\mathrm{in}}\to {A}_{1}^{\mathrm{out}}| {B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}})}$ is the dual affine space of the set of no-signaling channels.

We will denote by ${\mathsf{DualNoSig}}({A}_{1}^{\mathrm{in}}\to {A}_{1}^{\mathrm{out}}| {B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}})$ the set of operators satisfying conditions (61) and (62). The set ${\mathsf{DualNoSig}}$ contains all the Choi operators of the non-causal networks acting on pairs of local operations.

7.3. The max relative entropy of signaling

In some situations, such as the study of non-causal games [41], it is natural to search for the non-causal networks that maximize a certain figure of merit. For example, consider an experiment where Alice and Bob probe a non-causal network as in figure 4. In their local laboratories, Alice and Bob measure the output systems of the network with the POVMs ${\{{P}_{i}\}}_{i=1}^{K}$ and ${\{{Q}_{j}\}}_{j=1}^{L}$ , respectively, and prepare inputs for the systems, say ρ and σ, respectively. The outcomes i and j are assigned a score $\omega (i,j)$ , which quantifies the performance of the non-causal network. For example, Alice and Bob may want to quantify how much the network correlates their outcomes, corresponding to the score $\omega (i,j)={\delta }_{{ij}}$ . More generally, Alice and Bob can probe the network by preparing correlated states, applying local interactions, and performing local measurements.

Describing the test with a performance operator Ω, the maximum score achievable by quantum non-causal networks is

$\begin{eqnarray}&&{\omega }_{\max }=\underset{C\in {\mathsf{DualNoSig}}({A}_{1}^{\mathrm{in}}\to {A}_{1}^{\mathrm{out}}| {B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}})}{\max }\,\langle {\rm{\Omega }},C\rangle .\end{eqnarray} \tag{ 63 }$

Finding the network that achieves maximum score is similar to finding the entangled state that maximizes the violation of a Bell inequality. The optimization task can be tackled with our theorem 1, which provides a dual expression for the maximum score:

Proposition 9. Let ${\rm{\Omega }}\in {\mathsf{Herm}}({{ \mathcal H }}_{{A}_{1}^{\mathrm{out}}}\otimes {{ \mathcal H }}_{{A}_{1}^{\mathrm{in}}}\otimes {{ \mathcal H }}_{{B}_{1}^{\mathrm{out}}}\otimes {{ \mathcal H }}_{{B}_{1}^{\mathrm{in}}})$ be a generic performance operator and let ${\omega }_{\max }$ be the maximum score defined in equation (63). Then, one has

$\begin{eqnarray*}&&{\omega }_{\max }=\underset{{\rm{\Gamma }}\in {\mathsf{NoSig}}({A}_{1}^{\mathrm{in}}\to {A}_{1}^{\mathrm{out}}| {B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}})}{\min }\min \{\lambda \in {\mathbb{R}}\,| \,\,\lambda {\rm{\Gamma }}\geqslant {\rm{\Omega }}\}.\end{eqnarray*}$

When Ω is positive, the maximum score is given by

$\begin{eqnarray}&&{\omega }_{\max }={2}^{{D}_{\max }({\rm{\Omega }}\parallel {\mathsf{NoSig}}({A}_{1}^{\mathrm{in}}\to {A}_{1}^{\mathrm{out}}| {B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}}))}.\end{eqnarray} \tag{ 64 }$

In words: the maximum score achieved by quantum non-causal networks is determined by the deviation of the performance operator from the set of (Choi operators of) no-signaling channels. We call ${D}_{\max }(\,A\,\parallel \,{\mathsf{NoSig}}({A}_{1}^{\mathrm{in}}\to {A}_{1}^{\mathrm{out}}| {B}_{1}^{\mathrm{in}}\to {B}_{1}^{\mathrm{out}}))$ the max relative entropy of signaling of the operator A, in analogy with the relative entropy of entanglement of a state ρ [94–96].
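The inner minimization $\min \{\lambda \,| \,\lambda {\rm{\Gamma }}\geqslant {\rm{\Omega }}\}$ can be made concrete in the simplest possible setting. The sketch below assumes that Ω and every candidate Γ are positive and diagonal in a common basis, so that the operator inequality reduces to an entrywise comparison of eigenvalues. The function names and the candidate list are hypothetical illustrations; in particular, the candidates are arbitrary normalized spectra and are not claimed to be Choi operators of no-signaling channels.

```python
import math

def dmax_diagonal(omega, gamma):
    """D_max(omega || gamma) in bits for commuting positive operators,
    given as lists of eigenvalues in a shared eigenbasis (an assumption)."""
    lam = 0.0
    for w, g in zip(omega, gamma):
        if w == 0:
            continue  # no constraint from this eigenvalue
        if g == 0:
            return math.inf  # omega has support outside gamma
        lam = max(lam, w / g)  # smallest lambda with lambda * g >= w
    return math.log2(lam) if lam > 0 else -math.inf

def dmax_to_set(omega, candidates):
    """Minimize D_max over a finite list of candidate operators."""
    return min(dmax_diagonal(omega, g) for g in candidates)

# Hypothetical spectra, chosen only to exercise the formula.
omega = [0.5, 0.25, 0.25, 0.0]
candidates = [[0.25, 0.25, 0.25, 0.25], [0.4, 0.2, 0.2, 0.2]]

best = dmax_to_set(omega, candidates)
score = 2 ** best  # analogue of equation (64) for this toy candidate set
print(best, score)
```

For non-commuting operators the entrywise ratio is no longer valid, and the minimization becomes a semidefinite program over the full no-signaling set.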

7.4. Optimizing multipartite non-causal networks

The results presented in the bipartite case can be easily generalized to multipartite non-causal networks. Consider a quantum network that can interact with k local devices, by providing an input system to each device and annihilating its output system. As in the bipartite case, the network can be represented by its Choi operator C, which will have to satisfy the condition

$\begin{eqnarray*}&&\Tr [C({\tilde{A}}_{1}\otimes {\tilde{A}}_{2}\otimes \cdots \otimes \,{\tilde{A}}_{k})]=1,\end{eqnarray*}$

for every set of Choi operators $({\tilde{A}}_{1},{\tilde{A}}_{2},\cdots ,\,{\tilde{A}}_{k})$ representing local quantum channels. Equivalently, the normalization condition can be expressed as

$\begin{eqnarray*}&&\Tr [C\,D]=1,\end{eqnarray*}$

for every Choi operator D representing a k-partite no-signaling channel. Specifically, the set of Choi operators representing k-partite no-signaling channels is defined as follows:

Definition 6. An operator D, acting on ${\bigotimes }_{i=1}^{k}({{ \mathcal H }}_{{A}_{i}^{\mathrm{out}}}\otimes {{ \mathcal H }}_{{A}_{i}^{\mathrm{in}}})$ , is the Choi operator of a no-signaling channel iff for every subset ${\mathsf{J}}\subseteq \{1,\ldots ,k\}$ one has

$\begin{eqnarray*}&&{\Tr }_{{A}_{{\mathsf{J}}}^{\mathrm{out}}}[D]={I}_{{A}_{{\mathsf{J}}}^{\mathrm{in}}}\otimes {D}_{{\mathsf{J}}}^{c},\end{eqnarray*}$

where ${\Tr }_{{A}_{{\mathsf{J}}}^{\mathrm{out}}}$ is the partial trace over the Hilbert space ${{ \mathcal H }}_{{A}_{{\mathsf{J}}}^{\mathrm{out}}}:= {\bigotimes }_{i\in {\mathsf{J}}}{{ \mathcal H }}_{{A}_{i}^{\mathrm{out}}}$ , ${I}_{{A}_{{\mathsf{J}}}^{\mathrm{in}}}$ is the identity operator on the Hilbert space ${{ \mathcal H }}_{{A}_{{\mathsf{J}}}^{\mathrm{in}}}:= {\bigotimes }_{i\in {\mathsf{J}}}{{ \mathcal H }}_{{A}_{i}^{\mathrm{in}}}$ , and ${D}_{{\mathsf{J}}}^{c}$ is the Choi operator of a quantum channel transforming density matrices on ${{ \mathcal H }}_{{A}_{{\mathsf{J}}}^{c,{in}}}:= {\bigotimes }_{i\not\hspace{-2pt}{\in }{\mathsf{J}}}{{ \mathcal H }}_{{A}_{i}^{\mathrm{in}}}$ into density matrices on ${{ \mathcal H }}_{{A}_{{\mathsf{J}}}^{c,{\rm{out}}}}:= {\bigotimes }_{i\not\hspace{-2pt}{\in }{\mathsf{J}}}{{ \mathcal H }}_{{A}_{i}^{\mathrm{out}}}$ .

We denote the set of k-partite no-signaling channels as ${{\mathsf{NoSig}}}_{k}$ , keeping implicit the specification of the Hilbert spaces.

Like in the bipartite case, it is natural to consider tasks where one has to find the non-causal network that maximizes a score of the form $\omega =\Tr [{\rm{\Omega }}C]$ for some performance operator Ω. The maximum score is then given by

$\begin{eqnarray}&&{\omega }_{\max }=\max \{\,\Tr [{\rm{\Omega }}C]\,| \,C\in \bar{{{\mathsf{NoSig}}}_{k}}\,\}.\end{eqnarray} \tag{ 65 }$

In general, characterizing the dual affine space of the set of no-signaling channels is a rather laborious task. Using theorem 1 we can circumvent the problem and express the maximum score as

$\begin{eqnarray}&&{\omega }_{\max }=\underset{D\in {{\mathsf{NoSig}}}_{k}}{\min }\min \{\,\lambda \in {\mathbb{R}}\,| \,\lambda \,D\geqslant {\rm{\Omega }}\,\}\,,\end{eqnarray} \tag{ 66 }$

or, when Ω is positive

$\begin{eqnarray}&&{\omega }_{\max }={2}^{{D}_{\max }({\rm{\Omega }}\parallel {{\mathsf{NoSig}}}_{k})}.\end{eqnarray} \tag{ 67 }$

Again, the performance is determined by the deviation of the performance operator from the set of (Choi operators of) no-signaling channels.

8. The max relative entropy of non-causal networks

Like in the case of causal networks, the max relative entropy between two quantum networks can be related to the max relative entropy of their output states:

Proposition 10. Let ${C}^{(0)}$ and ${C}^{(1)}$ be the Choi operators of two non-causal networks in $\bar{{{\mathsf{NoSig}}}_{k}}$ and let E be the Choi operator of a network of the form

where $S$ is a generic quantum system and the reduced channel

is no-signaling. Then, one has

$\begin{eqnarray}&&{D}_{\max }({C}^{(0)}\,\parallel \,{C}^{(1)})=\underset{E}{\max }\,{D}_{\max }({C}^{(0)}* E\,\parallel \,{C}^{(1)}\ast E),\end{eqnarray} \tag{ 70 }$

where the maximum runs over all networks of the form (69), with arbitrary system $S$ .

The proof is the same as the proof of proposition 7. The above result shows that the max relative entropy between two non-causal networks is equal to the max relative entropy between the output states generated by connecting the networks to the 'no-signaling part' of a quantum channel, as in figure 5.

Like in the causal case, there is a nice connection to one-shot hypothesis testing. Here one can consider the problem of distinguishing between two alternative models of spacetime, resulting in different ways to connect the operations performed in N local laboratories. For example, ${C}^{(0)}$ could describe a null hypothesis of spacetime where all the events are causally ordered, while ${C}^{(1)}$ could describe an exotic, non-causal spacetime. Proposition 10 tells us that, in terms of max relative entropy, the distinguishability of two models of spacetime is quantified by the max relative entropy of the corresponding non-causal networks.

9. Applications

In the following we apply our results to four optimization problems involving quantum networks. We will start from the causal case, considering networks that approximately transform a given set of input channels into a target set of output channels. Then, we will move to the case of non-causal networks.

9.1. Transforming quantum channels

Consider the following scenario: A black box implements a quantum channel in the set ${\{{{ \mathcal E }}_{x}\}}_{x\in {\mathsf{X}}}$ , where ${\mathsf{X}}$ is an arbitrary index set. The task is to simulate another channel ${{ \mathcal F }}_{x}$ using the channel ${{ \mathcal E }}_{x}$ as a subroutine. For example, the black box could implement a unitary gate U_x and the task could be to build the control-unitary gate [97–100].

$\begin{eqnarray*}&&{\mathtt{ctrl}}-{U}_{x}=I\otimes | 0\rangle \langle 0| +{U}_{x}\otimes | 1\rangle \langle 1| .\end{eqnarray*}$

To simulate the desired channel ${{ \mathcal F }}_{x}$ , we insert the input channel ${{ \mathcal E }}_{x}$ into a quantum causal network, as in the following diagram

where ${{ \mathcal C }}_{1}$ and ${{ \mathcal C }}_{2}$ are suitable quantum channels. The Choi operator of the output channel ${{ \mathcal E }}_{x}^{\prime }$ is then given by

$\begin{eqnarray}&&{E}_{x}^{\prime }=C\ast {E}_{x},\end{eqnarray} \tag{ 72 }$

where C is the Choi operator of the network and * denotes the link product.

Let us focus on the case where the target channel ${{ \mathcal F }}_{x}$ is an isometry, namely ${{ \mathcal F }}_{x}={V}_{x}\cdot {V}_{x}^{\dagger }$ , with ${V}_{x}^{\dagger }{V}_{x}=I$ . To measure how close the channel ${{ \mathcal E }}_{x}^{\prime }$ is to the target, we use the channel fidelity [101–103], given by

$\begin{eqnarray}&&F({ \mathcal E }{}_{x}^{\prime },{{ \mathcal F }}_{x}):= \displaystyle \frac{1}{{d}_{0}^{2}} \langle\langle {V}_{x}| E{}_{x}^{\prime }| {V}_{x} \rangle\rangle ,\end{eqnarray} \tag{ 73 }$

where d₀ is the dimension of the input system A₀ and the notation $| V \rangle\rangle$ denotes the unnormalized state

$\begin{eqnarray*}&&| V \rangle\rangle := (V\otimes I)\,| I \rangle\rangle ,\qquad | I \rangle\rangle := \displaystyle \sum _{n=1}^{d}\,| n\rangle | n\rangle .\end{eqnarray*}$

In this case, the fidelity can be interpreted as the probability that the network passes a test, where the channel ${ \mathcal E }{}_{x}^{\prime }$ is applied locally on one part of an entangled state and the output is tested with a POVM containing the projector on the entangled state $| V \rangle\rangle /\sqrt{{d}_{0}}$ . The fidelity can be expressed as

$\begin{eqnarray*}&&F({ \mathcal E }{}_{x}^{\prime },{{ \mathcal F }}_{x})=\displaystyle \frac{1}{{d}_{0}^{2}}\Tr [C(\,| {V}_{x} \rangle\rangle \langle\langle {V}_{x}| \otimes {E}_{x}^{T})].\end{eqnarray*}$

Now, if the input channel ${{ \mathcal E }}_{x}$ is given with prior probability $p(x)$ , the average channel fidelity is given by

$\begin{eqnarray}F & = & \displaystyle \sum _{x}\,\ p(x)\,F({{ \mathcal E }}_{x}^{\prime} ,{{ \mathcal F }}_{x})\\ & = & \Tr [{\rm{\Omega }}\,C],\qquad {\rm{\Omega }}:= \displaystyle \sum _{x}\,p(x)(\,| {V}_{x} \rangle\rangle \langle\langle {V}_{x}| \otimes {E}_{x}^{T}).\end{eqnarray} \tag{ 74 }$

Thanks to theorem 2, the maximum fidelity can be expressed as

$\begin{eqnarray}&&{F}_{\max }=\underset{{\rm{\Gamma }}\in {\mathsf{DualComb}}}{\min }\{\lambda \in {\mathbb{R}}\,| \,\lambda \,{\rm{\Gamma }}\geqslant {\rm{\Omega }}\},\end{eqnarray} \tag{ 75 }$

where ${\mathsf{DualComb}}$ is the set of positive operators on ${{ \mathcal H }}_{3}\otimes {{ \mathcal H }}_{2}\otimes {{ \mathcal H }}_{1}\otimes {{ \mathcal H }}_{0}$ satisfying the conditions

$\begin{eqnarray}{\rm{\Gamma }} & = & {I}_{3}\otimes {T}_{210},\\ {\Tr }_{2}[{T}_{210}] & = & {I}_{1}\otimes {T}_{0},\\ \Tr [{T}_{0}] & = & 1.\end{eqnarray} \tag{ 76 }$

In the following we illustrate the use of this expression in a few examples.

9.2. Optimal inversion of an unknown unitary dynamics

Unitary quantum dynamics is, by definition, invertible: given a classical description of a unitary gate U, in principle one can always engineer the gate ${U}^{\dagger }$ implementing the inverse physical process. However, the situation is different when the gate U is unknown. Can we devise a physical inversion mechanism, which transforms every unknown unitary dynamics U, given as a black box, into its inverse? Classically, the analogue of inverting a unitary dynamics is inverting a permutation. Inverting a permutation with a single evaluation is clearly impossible, because evaluating the permutation allows us to know its action on one input at most, and this information is not sufficient to perform an inversion on the other inputs. In the quantum domain, the situation is more interesting, because one use of a unitary gate is enough to store it faithfully into a quantum memory, by applying U on one side of a maximally entangled state. A first question is whether the information stored in the memory can be extracted and used to implement the inverse gate ${U}^{\dagger }$ . Interestingly, this possibility is barred by Nielsen and Chuang's no-programming theorem [104], which states that only orthogonal states can be used to program unitary gates deterministically and without error. As an alternative, one can try to think of protocols that simulate ${U}^{\dagger }$ with one use of U, without storing U in a quantum memory. Protocols of this form are implemented by quantum networks as in equation (71). We now show that even such protocols cannot implement a perfect inversion. More specifically, we show that the best way to generate the inverse of an unknown dynamics is simply to estimate it and to use the estimate to implement an approximate inversion.

Our result highlights an analogy between the optimal inversion of an unknown unitary dynamics and the optimal universal NOT (UNOT) gate [105, 106], the quantum channel that attempts to transform every pure quantum state into its orthogonal complement. A known fact is that no quantum channel can approximate the ideal UNOT gate better than a channel that measures the input state and produces an orthogonal state based on the measurement outcome [105, 106]. Considering this feature, one can think of the unitary inversion as the analogue of the UNOT: they are both involutions and they both are implemented optimally by measure-and-prepare strategies.

Let us assume that the unknown unitary gate U is drawn at random according to the normalized Haar measure ${\rm{d}}U$ . Then, the performance operator in equation (74) takes the form

$\begin{eqnarray}&&{\rm{\Omega }}=\displaystyle \frac{1}{{d}^{2}}\int {\rm{d}}U\,| {U}^{\dagger } \rangle\rangle \langle\langle {U}^{\dagger }{| }_{30}\otimes | \bar{U} \rangle\rangle \langle\langle \bar{U}{| }_{21},\end{eqnarray} \tag{ 77 }$

with $d={d}_{0}={d}_{1}={d}_{2}={d}_{3}$ . The evaluation of the fidelity is provided in appendix G, where we obtain the value

$\begin{eqnarray}&&{F}_{\max }=\displaystyle \frac{2}{{d}^{2}}.\end{eqnarray} \tag{ 78 }$

Now, it turns out that the maximum fidelity can be achieved through the estimation of the gate U. Indeed, the optimal strategy for gate estimation is to prepare a maximally entangled state, to apply the unknown gate U on one side, and to perform the POVM ${P}_{\widehat{U}}=d\,| \widehat{U} \rangle\rangle \langle\langle \widehat{U}|$ [107]. This strategy leads to the conditional probability distribution

$\begin{eqnarray*}&&p(\widehat{U}| U)={| \Tr [{U}^{\dagger }\widehat{U}]| }^{2},\end{eqnarray*}$

normalized with respect to the Haar measure. Averaging the channel fidelity $F(\hat{U},U)={| \Tr [{U}^{\dagger }\widehat{U}]| }^{2}/{d}^{2}$ , we then obtain the value

$\begin{eqnarray}{F}_{\mathrm{est}}(U) & = & \displaystyle \int {\rm{d}}\widehat{U}{| \Tr [{U}^{\dagger }\widehat{U}]| }^{4}/{d}^{2}\\ & = & 2/{d}^{2}\\ & \equiv & {F}_{\max },\qquad \forall U\in {\mathsf{SU}}(d).\end{eqnarray} \tag{ 79 }$

The continuous POVM with ${P}_{\hat{U}}=d| \hat{U} \rangle\rangle \langle\langle \hat{U}|$ can also be replaced by a discrete Bell measurement, with d² outcomes, without affecting the fidelity in the worst case scenario, or equivalently, the average fidelity over all unitaries. One way or another, the above discussion proves that no quantum network can invert a gate better than a classical network that generates the inverse by using gate estimation as an intermediate step.
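The value $2/{d}^{2}$ in equation (79) can be checked numerically in the smallest case $d=2$. The following is a minimal Monte Carlo sketch, not part of the original derivation. It assumes $U=I$ (which loses no generality by invariance of the Haar measure) and uses the standard fact that a Haar-random element of ${\mathsf{SU}}(2)$ corresponds to a uniformly random unit vector $({x}_{0},{x}_{1},{x}_{2},{x}_{3})$ on the 3-sphere, with trace $2{x}_{0}$. The function names, sample size and seed are illustrative choices.

```python
import math
import random

def haar_su2_trace(rng):
    """Trace of a Haar-random SU(2) matrix, sampled through a uniformly
    random point (x0, x1, x2, x3) on the 3-sphere; the trace equals 2*x0."""
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(v * v for v in x))
    return 2.0 * x[0] / norm

def estimation_fidelity(samples=200_000, seed=0):
    """Monte Carlo estimate of the integral of |Tr U_hat|^4 / d^2 for d = 2,
    together with the normalization integral of |Tr U_hat|^2."""
    rng = random.Random(seed)
    d = 2
    fid_sum = 0.0
    norm_sum = 0.0
    for _ in range(samples):
        t = haar_su2_trace(rng)
        norm_sum += t ** 2          # p(U_hat | I) = |Tr U_hat|^2
        fid_sum += t ** 4 / d ** 2  # p(U_hat | I) * F(U_hat, I)
    return fid_sum / samples, norm_sum / samples

f_est, normalization = estimation_fidelity()
print(f_est, normalization)
```

Up to sampling error, the first number should approach $2/{d}^{2}=1/2$ and the second should approach 1, confirming that $p(\widehat{U}| U)$ is normalized with respect to the Haar measure.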

9.3. Simulating the evolution of a charge conjugate particle

In quantum mechanics, complex conjugation implements the symmetry between particles and antiparticles. If the evolution of a quantum particle is described by the unitary transformation U, then the evolution of the corresponding antiparticle will be described by the unitary transformation $\bar{U}$ , where each matrix element is replaced by its complex conjugate. Consider the scenario where one is given a black box that performs a unitary transformation on a certain particle. Can we use this black box to simulate the evolution of the corresponding antiparticle? Physically, the most general simulation strategy is described by a quantum network as in equation (71).

For the charge conjugation problem, the performance operator Ω reads

$\begin{eqnarray}{\rm{\Omega }} & = & \displaystyle \frac{1}{{d}^{2}}\displaystyle \int {\rm{d}}U\,| \bar{U} \rangle\rangle \langle\langle \bar{U}{| }_{30}\otimes | \bar{U} \rangle\rangle \langle\langle \bar{U}{| }_{21}\\ & = & \displaystyle \frac{1}{{d}^{2}}\left(\displaystyle \frac{{P}_{+,32}\otimes {P}_{+,10}}{{d}_{+}}+\displaystyle \frac{{P}_{-,32}\otimes {P}_{-,10}}{{d}_{-}}\right)\ ,\end{eqnarray} \tag{ 80 }$

where ${P}_{+}$ and ${P}_{-}$ ( ${d}_{+}$ and ${d}_{-}$ ) are the projectors on (the dimensions of) the symmetric and antisymmetric subspaces, respectively. In appendix H we evaluate the dual expression equation (75), obtaining the maximum fidelity

$\begin{eqnarray*}&&{F}_{\max }=\displaystyle \frac{2}{d(d-1)}.\end{eqnarray*}$

Note that the fidelity is equal to 1 in the case of two-dimensional quantum systems. This is consistent with the fact that, for $d=2$ , the matrices U and $\bar{U}$ are unitarily equivalent—specifically, $\bar{U}={YUY}$ , where Y is the Pauli matrix $Y:= \left(\begin{array}{cc}0 & -i\\ i & 0\end{array}\right)$ . Therefore, one can implement the complex conjugation by sandwiching the original unitary between two Pauli gates.
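The qubit identity $\bar{U}={YUY}$ is easy to verify numerically. The sketch below is an illustration only: it assumes U has unit determinant and uses a hypothetical three-angle parametrization `su2` of ${\mathsf{SU}}(2)$. For a general qubit unitary the identity holds up to a global phase, which does not affect the corresponding channel.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def conj(m):
    """Entrywise complex conjugate (no transposition)."""
    return [[z.conjugate() for z in row] for row in m]

def su2(theta, phi, chi):
    """Generic SU(2) matrix [[a, -conj(b)], [b, conj(a)]] with |a|^2 + |b|^2 = 1."""
    a = cmath.exp(1j * phi) * math.cos(theta)
    b = cmath.exp(1j * chi) * math.sin(theta)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

Y = [[0j, -1j], [1j, 0j]]  # Pauli Y matrix

U = su2(0.7, 1.1, -0.4)    # arbitrary angles, chosen for illustration
YUY = matmul(matmul(Y, U), Y)

# Largest entrywise deviation between conj(U) and Y U Y.
err = max(abs(x - y) for rx, ry in zip(conj(U), YUY) for x, y in zip(rx, ry))
print(err)
```

The deviation `err` vanishes up to floating-point rounding for any choice of the three angles.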

For systems of large dimension, the fidelity converges to $2/{d}^{2}$ , the value achieved by gate estimation (see equation (79) in the previous paragraph). This means that gate estimation is asymptotically the optimal strategy, but, remarkably, it is not the optimal strategy when d is finite. The optimal simulation of the charge conjugate dynamics is achieved by the network with Choi operator

$\begin{eqnarray*}&&C=\displaystyle \frac{d\,{P}_{-,32}}{{d}_{-}}\otimes \displaystyle \frac{d\,{P}_{-,10}}{{d}_{-}}.\end{eqnarray*}$

It is immediate to verify that, indeed, the operator C satisfies the normalization constraints and that one has $\Tr [{\rm{\Omega }}C]=1/{d}_{-}={F}_{\max }$ . Physically, C represents a 'disconnected network' of the form

consisting of two subsequent uses of the channel ${ \mathcal K }$ with Choi operator $K:= d\,{P}_{-}/{d}_{-}$ . When the input gate U is inserted in the open slot, the overall evolution from system A₀ to system A₃ is given by the channel ${ \mathcal F }^{\prime} ={ \mathcal K }\,{ \mathcal U }\,{ \mathcal K }$ , which optimally simulates the charge conjugate evolution $\bar{U}$ .

It is interesting to further elaborate on the physical meaning of the operations in the network. At first, one may guess that the optimal way to conjugate an unknown unitary U is to approximate the sequence of transformations

$\begin{eqnarray}&&\rho \quad \overset{\mathrm{transpose}}{\longrightarrow }\quad {\rho }^{T}\quad \overset{U}{\longrightarrow }\quad U{\rho }^{T}{U}^{\dagger }\quad \overset{\mathrm{transpose}}{\longrightarrow }\quad \bar{U}\rho {U}^{T}.\end{eqnarray} \tag{ 81 }$

As the transpose is not a physical operation, one may try to use the optimal transpose channel [108–112], which has Choi operator $T={{dP}}_{+}/{d}_{+}$ . However, this choice would be suboptimal, leading to the fidelity

$\begin{eqnarray*}&&{F}_{\mathrm{transpose}}=1/{d}_{+}=2/[d(d+1)]\lt {F}_{\max }.\end{eqnarray*}$

Instead, the optimal strategy is to approximate the transpose ${\mathtt{NOT}}$ , i.e. the impossible transformation that maps every projector into its orthogonal complement. In the Heisenberg picture, the transpose ${\mathtt{NOT}}$ maps every observable A into the observable $I-{A}^{T}$ , allowing us to reproduce the charge conjugate dynamics as

$\begin{eqnarray*}&&A\quad \overset{\,\mathrm{transpose}\,{\mathtt{NOT}}\,}{\longrightarrow }\quad I-{A}^{T}\quad \overset{U}{\longrightarrow }\quad I-{U}^{\dagger }{A}^{T}U\quad \overset{\,\mathrm{transpose}\,{\mathtt{NOT}}\,}{\longrightarrow }\quad {U}^{T}A\bar{U}.\end{eqnarray*}$

It turns out that the optimal approximation of the transpose ${\mathtt{NOT}}$ is exactly the channel ${ \mathcal K }$ used in our network: in summary, the optimal simulation of the charge conjugate dynamics employs the optimal transpose ${\mathtt{NOT}}$ instead of the optimal transpose. Some intuition to justify this bizarre fact comes from the observation that the optimal transpose can be implemented via state estimation and, therefore, approximating the sequence (81) would lead to a classical, estimation-based strategy. Instead, the transpose ${\mathtt{NOT}}$ cannot be achieved via state estimation. For example, the transpose ${\mathtt{NOT}}$ for qubits is a unitary transformation, corresponding to the Pauli matrix Y.
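The three fidelities quoted above (the optimum $2/[d(d-1)]$, the estimation value $2/{d}^{2}$, and the transpose-based value $2/[d(d+1)]$) can be tabulated exactly with rational arithmetic. The sketch below only evaluates the closed-form expressions stated in the text. The function names are hypothetical.

```python
from fractions import Fraction

def f_max_conjugation(d):
    """Optimal fidelity for simulating the charge conjugate gate: 1/d_-."""
    return Fraction(2, d * (d - 1))

def f_estimation(d):
    """Fidelity of the gate-estimation strategy, equation (79)."""
    return Fraction(2, d * d)

def f_transpose(d):
    """Fidelity obtained with two uses of the optimal transpose: 1/d_+."""
    return Fraction(2, d * (d + 1))

for d in (2, 3, 4, 10, 100):
    # Ordering to observe: transpose < estimation < optimum.
    print(d, f_transpose(d), f_estimation(d), f_max_conjugation(d))
    # Ratio of the optimum to the estimation value, equal to d / (d - 1).
    print("  optimum / estimation =", f_max_conjugation(d) / f_estimation(d))
```

For $d=2$ the three values are 1/3, 1/2 and 1. The ratio $d/(d-1)$ tends to 1 as d grows, which is the sense in which gate estimation is asymptotically optimal.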

9.4. Optimal controlization of unknown gates

Given a unitary gate U, the corresponding control unitary gate is

$\begin{eqnarray*}&&{\mathtt{ctrl}}-U:= I\otimes | 0\rangle \langle 0| +U\otimes | 1\rangle \langle 1| ,\end{eqnarray*}$

where $| 0\rangle$ and $| 1\rangle$ are the states of a qubit acting as control system. Controlization is the task of transforming an unknown gate U, accessed as a black box, into the corresponding gate ${\mathtt{ctrl}}-U$ .

When U is an arbitrary unitary, perfect controlization is impossible, as was recently shown in [97, 98]. Like the no-cloning theorem, this 'no-controlization' result establishes the impossibility of a perfect functionality. But what about approximate controlization? A priori, nothing forbids that one could engineer an approximate controlization protocol that achieves high fidelity, almost circumventing the no-go theorem. In the following we show that this is not the case. For a completely unknown unitary gate U, we show that not only is perfect controlization impossible, but also that every quantum strategy for controlization will be at most as good as a classical strategy that measures the control qubit and performs the gate U or the identity depending on the measurement outcome.

For the controlization task, the performance operator Ω reads

$\begin{eqnarray}&&{\rm{\Omega }}=\displaystyle \frac{1}{2{d}^{2}}\int {\rm{d}}U\,| {\mathtt{ctrl}}-U\rangle \rangle \langle \langle {\mathtt{ctrl}}-U| \otimes | \bar{U}\rangle \rangle \langle \langle \bar{U}| .\end{eqnarray} \tag{ 82 }$

The evaluation carried out in appendix I yields the maximum fidelity

$\begin{eqnarray*}&&{F}_{\max }=\displaystyle \frac{1}{2}\,.\end{eqnarray*}$

By direct inspection, one can check that this is the same fidelity achieved by a network that measures the control qubit in the computational basis $\{| 0\rangle ,| 1\rangle \}$ and applies the unknown gate U when the outcome is 1. Specifically, such a strategy turns the input gate U into the classically-controlled channel ${{ \mathcal C }}_{U}$ defined by

$\begin{eqnarray*}&&{{ \mathcal C }}_{U}(\rho \otimes \sigma ):= \langle 0| \sigma | 0\rangle \,\rho +\langle 1| \sigma | 1\rangle \,U\rho {U}^{\dagger },\end{eqnarray*}$

where ρ is an arbitrary state of the system and σ is an arbitrary state of the control qubit. It is immediate to check that the fidelity between the classically-controlled channel ${{ \mathcal C }}_{U}$ and the control-unitary gate is 1/2 for every unitary. The above argument shows that no quantum circuit can perform better than a classical circuit where the control qubit is decohered by a measurement.
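The value 1/2 for the classical strategy can be checked numerically. The sketch below is only an illustration under stated assumptions: the control qubit is kept after the measurement (in its decohered state), the fidelity is taken to be the process fidelity $\frac{1}{D^{2}}\sum_{k}|\Tr [V^{\dagger }K_{k}]|^{2}$ between a channel with Kraus operators $K_{k}$ and a unitary $V$ on a $D$-dimensional space, and the helper names `haar_unitary` and `classical_control_fidelity` are hypothetical.

```python
import numpy as np


def haar_unitary(d, rng):
    """Sample a d x d unitary from the Haar measure via a QR decomposition."""
    z = (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # rescale the columns so that the distribution is Haar


def classical_control_fidelity(U):
    """Process fidelity between ctrl-U and the measure-then-apply strategy.

    Assumption: the control qubit is retained, but decohered by the measurement.
    """
    d = U.shape[0]
    P0 = np.diag([1.0, 0.0])  # |0><0| on the control qubit
    P1 = np.diag([0.0, 1.0])  # |1><1| on the control qubit
    # Ordering: system (x) control, as in rho (x) sigma.
    ctrl_U = np.kron(np.eye(d), P0) + np.kron(U, P1)
    # Kraus operators of the classical strategy, one per measurement outcome.
    kraus = [np.kron(np.eye(d), P0), np.kron(U, P1)]
    D = 2 * d
    return sum(abs(np.trace(ctrl_U.conj().T @ K)) ** 2 for K in kraus) / D**2


rng = np.random.default_rng(0)
fidelities = [classical_control_fidelity(haar_unitary(d, rng)) for d in (2, 3, 4)]
print(fidelities)
```

Each Kraus operator contributes $|\Tr [V^{\dagger }K_{k}]|^{2}={d}^{2}$, so the fidelity equals $2{d}^{2}/{(2d)}^{2}=1/2$ independently of U and of the dimension.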

9.5. Maximization of the payoff in a non-causal quantum game

Here we consider the non-causal game introduced by Oreshkov et al in [41]. The game involves two spatially separated parties, Alice and Bob, and a referee, who sends inputs to and receives outputs from the players. Specifically, the referee sends an input bit a to Alice and two input bits b and $b^{\prime}$ to Bob. Then, the referee demands one output bit x from Alice and one output bit y from Bob. The referee assigns a score $\omega (x,y| a,b,b^{\prime} )$ , given by

$\begin{eqnarray}\omega (x,y| a,b,0)=\left\{\begin{array}{ll}1 & \qquad x=b,\\ 0 & \qquad x\ne b,\end{array}\right.\qquad \mathrm{and}\qquad \omega (x,y| a,b,1)=\left\{\begin{array}{ll}1 & \qquad y=a,\\ 0 & \qquad y\ne a.\end{array}\right.\end{eqnarray} \tag{ 83 }$

In this game, Alice and Bob are not subject to the no-signaling constraint. In principle, Alice may be able to communicate to Bob, or vice versa. The only constraint is that Alice and Bob can interact only through a fixed network, which allows for communication in at most one direction: either from Alice to Bob, or from Bob to Alice.
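To get a feeling for the score function (83), one can enumerate classical deterministic strategies with one-way communication. The sketch below is an illustration only: it assumes noiseless classical communication of all the inputs of the first party, it covers neither randomized nor quantum strategies, and the function names are hypothetical.

```python
from itertools import product


def score(x, y, a, b, b_prime):
    """Score function of equation (83)."""
    if b_prime == 0:
        return 1 if x == b else 0
    return 1 if y == a else 0


def best_alice_first():
    """Alice outputs x = f(a) and forwards a; Bob outputs y = g(a, b, b')."""
    best = 0.0
    for f in product((0, 1), repeat=2):      # truth table of f
        for g in product((0, 1), repeat=8):  # truth table of g
            total = sum(
                score(f[a], g[4 * a + 2 * b + bp], a, b, bp)
                for a, b, bp in product((0, 1), repeat=3)
            )
            best = max(best, total / 8)
    return best


def best_bob_first():
    """Bob outputs y = g(b, b') and forwards (b, b'); Alice outputs x = f(a, b, b')."""
    best = 0.0
    for g in product((0, 1), repeat=4):      # truth table of g
        for f in product((0, 1), repeat=8):  # truth table of f
            total = sum(
                score(f[4 * a + 2 * b + bp], g[2 * b + bp], a, b, bp)
                for a, b, bp in product((0, 1), repeat=3)
            )
            best = max(best, total / 8)
    return best


print(best_alice_first(), best_bob_first())
```

In both directions the party who acts first has to guess a bit it has no access to, which caps these classical strategies at an average score of 3/4.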

It is interesting to see how quantum resources can help Alice and Bob. The most general quantum resource is described by a network that connects Alice's operations to Bob's operations. The network will provide inputs ${A}^{\mathrm{in}}$ and ${B}^{\mathrm{in}}$ to Alice and Bob, respectively. Alice and Bob then perform local operations, transforming systems ${A}^{\mathrm{in}}$ and ${B}^{\mathrm{in}}$ into systems ${A}^{\mathrm{out}}$ and ${B}^{\mathrm{out}}$ . The local operations depend on the inputs a and $(b,b^{\prime} )$ and will generate the outputs x and y, respectively. Diagrammatically, this scenario is depicted in figure 6.

**Figure 6.** The quantum operations ${{ \mathcal A }}_{x}^{a}$ and ${{ \mathcal B }}_{y}^{b,b^{\prime} }$ in Alice's and Bob's laboratory interact through a non-causal network ${ \mathcal C }$ . The operations act on the Hilbert spaces ${{ \mathcal H }}_{{A}^{\mathrm{in}},{A}^{\mathrm{out}}}$ and ${{ \mathcal H }}_{{B}^{\mathrm{in}},{B}^{\mathrm{out}}}$ respectively. The network creates the input systems ${A}^{\mathrm{in}}$ and ${B}^{\mathrm{in}}$ and annihilates the output systems ${A}^{\mathrm{out}}$ and ${B}^{\mathrm{out}}$ .

Mathematically, the operations are described by two quantum instruments ${\{{{ \mathcal M }}_{x}^{a}\}}_{x=\mathrm{0,1}}$ and ${\{{{ \mathcal N }}_{y}^{b,b^{\prime} }\}}_{y=\mathrm{0,1}}$ . With these settings, the probability distribution of the outputs is given by

$\begin{eqnarray*}&&p(x,y| a,b,b^{\prime} )=\Tr [({M}_{x}^{a}\otimes {N}_{y}^{b,b^{\prime} })\,C],\end{eqnarray*}$

where $\{{M}_{x}^{a}\}{}_{x=\mathrm{0,1}}$ and $\{{N}_{y}^{b,b^{\prime} }\}{}_{y=\mathrm{0,1}}$ are the Choi operators of Alice's and Bob's instruments, respectively, and C is the Choi operator of the network that mediates the interaction.

The average score is then given by

$\begin{eqnarray*}\omega & = & \displaystyle \frac{1}{8}\,\displaystyle \sum _{a,b,b^{\prime} ,x,y}\,\omega (x,y| a,b,b^{\prime} )\,p(x,y| a,b,b^{\prime} )\\ & = & \Tr [{\rm{\Omega }}\,C],\end{eqnarray*}$

where Ω is the performance operator

$\begin{eqnarray}&&{\rm{\Omega }}:= \displaystyle \frac{1}{8}\,\displaystyle \sum _{a,b,b^{\prime} ,x,y}\,\omega (x,y| a,b,b^{\prime} )({M}_{x}^{a}\otimes {N}_{y}^{b,b^{\prime} }).\end{eqnarray} \tag{ 84 }$

The main result by Oreshkov et al is that the average score is upper bounded as $\omega \leqslant 3/4$ whenever the network C has a definite causal order, whereas there exists a non-causal network ${C}_{* }$ and local operations ${\{{{ \mathcal M }}_{x* }^{a}\}}_{x=\mathrm{0,1}}$ and ${\{{{ \mathcal N }}_{y* }^{b,b^{\prime} }\}}_{y=\mathrm{0,1}}$ that achieve score

$\begin{eqnarray}&&{\omega }_{* }=\displaystyle \frac{1}{2}\left(1+\displaystyle \frac{1}{\sqrt{2}}\right).\end{eqnarray} \tag{ 85 }$

Specifically, the score ${\omega }_{* }$ is achieved by choosing systems ${A}^{\mathrm{in}},{B}^{\mathrm{in}},{A}^{\mathrm{out}},{B}^{\mathrm{out}}$ to be qubits and by choosing the local operations with Choi operators

$\begin{eqnarray}{M}_{x* }^{a} & = & \displaystyle \frac{1}{4}{[I+{(-1)}^{x}{\sigma }_{z}]}_{{A}^{\mathrm{in}}}\otimes {[I+{(-1)}^{a}{\sigma }_{z}]}_{{A}^{\mathrm{out}}},\\ {N}_{y* }^{b,b^{\prime} } & = & \,b^{\prime} \left\{\displaystyle \frac{1}{2}{[I+{(-1)}^{y}{\sigma }_{z}]}_{{B}^{\mathrm{in}}}\otimes {\rho }_{{B}^{\mathrm{out}}}\right\}\\ & & +(b^{\prime} \oplus 1)\left\{\displaystyle \frac{1}{4}{[I+{(-1)}^{y}{\sigma }_{x}]}_{{B}^{\mathrm{in}}}\otimes {[I+{(-1)}^{b+y}{\sigma }_{z}]}_{{B}^{\mathrm{out}}}\right\},\end{eqnarray} \tag{ 86 }$

where $\oplus$ denotes the addition modulo 2 and ${\rho }_{{B}^{\mathrm{out}}}$ is a fixed quantum state on Bob's output, which can be chosen to be the maximally mixed state without loss of generality.

The score ω can be regarded as a measure of the non-causality of the network mediating the interactions between Alice and Bob. An interesting question is whether ${\omega }_{* }$ is the maximum score attainable when Alice's and Bob's instruments (86) are connected by an arbitrary non-causal network. This question has been indirectly answered by Brukner [54], who considered a more general scenario, wherein Alice's and Bob's local operations are also subject to optimization. Brukner showed that the payoff ${\omega }_{* }=(1+1/\sqrt{2})/2$ is maximum over all non-causal networks and over a certain class of two-outcome instruments on Alice's and Bob's side, allowing Alice's and Bob's systems to have generic dimensions. When Alice's and Bob's operations are fixed to the qubit operations (86) used in the original paper [41], we now present an alternative (and comparatively shorter) optimality proof for the value ${\omega }_{* }=(1+1/\sqrt{2})/2$ . This result serves as an illustration of the SDP method, which provides here a nice and straightforward solution.

Inserting equation (86) into equation (84) we obtain the performance operator

$\begin{eqnarray*}&&{\rm{\Omega }}=\displaystyle \sum _{i,j,k}\,| i\rangle \langle i{| }_{{A}^{\mathrm{in}}}\otimes | j\rangle \langle j{| }_{{A}^{\mathrm{out}}}\otimes {{\rm{\Omega }}}_{{ijk}}\otimes | k\rangle \langle k{| }_{{B}^{\mathrm{out}}},\end{eqnarray*}$

where ${{\rm{\Omega }}}_{{ijk}}$ are operators acting on ${B}^{\mathrm{in}}$ and are defined as

$\begin{eqnarray*}{{\rm{\Omega }}}_{000} & = & \displaystyle \frac{1}{8}(| +\rangle \langle +| +| 0\rangle \langle 0| ),\qquad \qquad {{\rm{\Omega }}}_{001}=\displaystyle \frac{1}{8}(| -\rangle \langle -| +| 0\rangle \langle 0| ),\\ {{\rm{\Omega }}}_{010} & = & \displaystyle \frac{1}{8}(| +\rangle \langle +| +| 1\rangle \langle 1| ),\qquad \qquad {{\rm{\Omega }}}_{011}=\displaystyle \frac{1}{8}(| -\rangle \langle -| +| 1\rangle \langle 1| ),\\ {{\rm{\Omega }}}_{100} & = & \displaystyle \frac{1}{8}(| -\rangle \langle -| +| 0\rangle \langle 0| ),\qquad \qquad {{\rm{\Omega }}}_{101}=\displaystyle \frac{1}{8}(| +\rangle \langle +| +| 0\rangle \langle 0| ),\\ {{\rm{\Omega }}}_{110} & = & \displaystyle \frac{1}{8}(| -\rangle \langle -| +| 1\rangle \langle 1| ),\qquad \qquad {{\rm{\Omega }}}_{111}=\displaystyle \frac{1}{8}(| +\rangle \langle +| +| 1\rangle \langle 1| ).\end{eqnarray*}$

Now, the dual optimization problem is to find the minimum λ such that $\lambda \,{\rm{\Gamma }}\geqslant {\rm{\Omega }}$ , for some Choi operator Γ representing a no-signaling channel. The key observation is that all the operators ${{\rm{\Omega }}}_{{ijk}}$ have the same maximum eigenvalue, equal to ${e}_{\max }=\tfrac{1}{8}(1+1/\sqrt{2})$ . Since Ω is block diagonal with blocks ${{\rm{\Omega }}}_{{ijk}}$ , this is also the maximum eigenvalue of Ω. As a result, we can satisfy the dual constraint by setting $\lambda =\tfrac{1}{2}(1+1/\sqrt{2})$ and ${\rm{\Gamma }}={I}_{{A}^{\mathrm{in}}{A}^{\mathrm{out}}{B}^{\mathrm{in}}{B}^{\mathrm{out}}}/4$ , so that $\lambda \,{\rm{\Gamma }}={e}_{\max }\,I\geqslant {\rm{\Omega }}$ . Note that Γ is the Choi operator of a no-signaling channel, as it satisfies equations (59) and (60). Hence, we obtain the bound

$\begin{eqnarray}&&\omega \leqslant \displaystyle \frac{1}{2}\left(1+\displaystyle \frac{1}{\sqrt{2}}\right),\end{eqnarray} \tag{ 87 }$

valid for every non-causal network. The bound can be achieved, since the right-hand side matches the value in equation (85).
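The dual bound can be reproduced numerically by assembling the performance operator (84) from the score function (83) and the Choi operators (86). The following is a minimal sketch, assuming the tensor ordering ${A}^{\mathrm{in}}\otimes {A}^{\mathrm{out}}\otimes {B}^{\mathrm{in}}\otimes {B}^{\mathrm{out}}$ and the maximally mixed state for ${\rho }_{{B}^{\mathrm{out}}}$ ; the function names are hypothetical.

```python
from itertools import product

import numpy as np

I2 = np.eye(2)
SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.diag([1.0, -1.0])


def choi_alice(x, a):
    """Choi operator of Alice's operation on A_in (x) A_out, equation (86)."""
    return np.kron(I2 + (-1) ** x * SZ, I2 + (-1) ** a * SZ) / 4


def choi_bob(y, b, b_prime, rho=I2 / 2):
    """Choi operator of Bob's operation on B_in (x) B_out, equation (86)."""
    if b_prime == 1:
        return np.kron((I2 + (-1) ** y * SZ) / 2, rho)
    return np.kron(I2 + (-1) ** y * SX, I2 + (-1) ** (b + y) * SZ) / 4


def score(x, y, a, b, b_prime):
    """Score function of equation (83)."""
    if b_prime == 0:
        return float(x == b)
    return float(y == a)


# Performance operator of equation (84).
omega = np.zeros((16, 16))
for a, b, bp, x, y in product((0, 1), repeat=5):
    omega += score(x, y, a, b, bp) * np.kron(choi_alice(x, a), choi_bob(y, b, bp)) / 8

e_max = np.linalg.eigvalsh(omega).max()
# With Gamma = I/4, the constraint lambda * Gamma >= Omega reads lambda / 4 >= e_max.
lam = 4 * e_max
print(e_max, lam, (1 + 1 / np.sqrt(2)) / 2)
```

The value of `lam` obtained in this way coincides with the score (85) up to numerical precision.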

10. Conclusions

We developed an SDP method for the optimization of quantum networks. The method can be applied to causal networks as well as more general networks with indefinite causal structure. For a large class of optimization problems, we observed that the maximum performance can be expressed in terms of a max relative entropy. Building on this fact, we extended the notions of conditional min-entropy and max relative entropy from quantum states to quantum networks. Specifically, the relative entropy between two networks can be characterized as the maximum of the relative entropy between the states that can be generated by the two networks. Similarly, the min-entropy of a quantum causal network can be characterized as the maximum min-entropy that the network can build up by interacting over time with a sequence of quantum devices. Intuitively, the network min-entropy can be regarded as a measure of the amount of quantum correlations generated over a sequence of time steps.

Our results have applications to a number of scenarios, including e.g. the optimization of algorithms for quantum causal discovery [28], tomography of quantum channels and causal networks [18, 69, 113, 114], and quantum machine learning [115–117]. Another stimulating avenue of future research is on the quantum engineering side, where our method can be adapted to deal with optimization tasks in the presence of limited energy resources. For example, it is interesting to explore the causal networks that can be implemented at zero energy cost, extending to the network scenario the results obtained in [118] for individual state transitions. The interesting aspect here is the possibility to borrow energy resources at a certain time and to return them at later times, resulting in an overall zero energy balance. As a further step, the extension from quantum networks working in the zero-energy regime to networks using bounded energy resources is even more compelling in view of future applications. Exploring how energy and coherence across energy eigenstates can be optimally allocated within a distributed system is expected to unveil new quantum advantages, leading to a new layer of optimization in the design of quantum technologies.

Acknowledgments

We acknowledge the referees of this paper for useful comments that helped improve the presentation and strengthen our results. The research of our group is supported by the Foundational Questions Institute (FQXi-RFP3-1325 and FQXi-MGA-1502), the National Natural Science Foundation of China through Grant No. 11675136, the Hong Kong Research Grant Council through Grant No. 17326616, the Canadian Institute for Advanced Research, the HKU Seed Funding for Basic Research, and the John Templeton Foundation. GC is grateful to F Buscemi and YC Liang for useful discussions and to A Acín, M Hoban, and R Chavez for organizing the workshop 'Quantum Networks', Barcelona, 30 March–1 April 2016, which offered the occasion for a stimulating exchange of ideas that benefitted this paper.

Appendix A.: Proof of theorem 1

Proof. By definition, the value of the primal problem is given by

$\begin{eqnarray*}\begin{array}{rcl}{\omega }_{\mathrm{primal}} & = & \sup \{\langle A,X\rangle \,| \,X\geqslant 0,X\in {\mathsf{S}}\}\\ & = & \sup \{\langle A,X\rangle \,| \,X\geqslant 0,X\in \bar{\bar{{\mathsf{S}}}}\}\\ & = & \sup \{\langle A,X\rangle \,| \,X\geqslant 0,\langle {\rm{\Gamma }},X\rangle =1,\forall {\rm{\Gamma }}\in \bar{{\mathsf{S}}}\},\end{array}\end{eqnarray*}$

having used the relation ${\mathsf{S}}=\bar{\bar{{\mathsf{S}}}}$ . Now, let us pick an affine basis for $\bar{{\mathsf{S}}}$ , say $({{\rm{\Gamma }}}_{i}{)}_{i=1}^{K}$ and re-write the value of the primal problem as

$\begin{eqnarray*}&&{\omega }_{\mathrm{primal}}=\sup \{\langle A,X\rangle \,| \,X\geqslant 0,\langle {{\rm{\Gamma }}}_{i},X\rangle =1,\,\forall i\in \{1,\ldots ,K\}\}.\end{eqnarray*}$

Weak duality then yields the relation

$\begin{eqnarray}&&{\omega }_{\mathrm{primal}}\leqslant \inf \left\{\sum _{i=1}^{K}{\lambda }_{i}\,| \,{\lambda }_{i}\in {\mathbb{R}},\displaystyle \sum _{i}{\lambda }_{i}\,{{\rm{\Gamma }}}_{i}\geqslant A\right\}\end{eqnarray} \tag{ A.1 }$

$\begin{eqnarray}&&\leqslant \,\inf \left\{\sum _{i=1}^{K}{\lambda }_{i}\,| \,{\lambda }_{i}\in {\mathbb{R}},\displaystyle \sum _{i}{\lambda }_{i}\,{{\rm{\Gamma }}}_{i}\geqslant A,\displaystyle \sum _{i}{\lambda }_{i}\ne 0\right\}\end{eqnarray} \tag{ A.2 }$

$\begin{eqnarray}&&=\,\underset{{\rm{\Gamma }}\in \bar{{\mathsf{S}}}}{\inf }\,\min \{\lambda \in {\mathbb{R}}\,| \,\lambda {\rm{\Gamma }}\geqslant A\},\end{eqnarray} \tag{ A.3 }$

having defined $\lambda := {\sum }_{i}\,{\lambda }_{i}$ and ${\rm{\Gamma }}:= {\sum }_{i}{\lambda }_{i}{{\rm{\Gamma }}}_{i}/\lambda$ .

Now, suppose that ${\mathsf{S}}$ contains a positive operator ${X}_{0}$ and $\bar{{\mathsf{S}}}$ contains a strictly positive operator ${{\rm{\Gamma }}}_{0}$ . Then Slater's theorem implies the equality. Indeed, one can choose the affine basis $({{\rm{\Gamma }}}_{i}{)}_{i=1}^{K}$ to contain the operator ${{\rm{\Gamma }}}_{0}$ . Since ${{\rm{\Gamma }}}_{0}$ is strictly positive, one can find strictly positive coefficients $({\lambda }_{i}{)}_{i=1}^{K}$ such that $\sum _{i}\,{\lambda }_{i}\,{{\rm{\Gamma }}}_{i}\geqslant A$ . This means that the dual problem on the right-hand side of equation (A.1) admits a strictly positive solution. Hence, proposition 4 implies the equality in equation (A.1). The equality holds also in equation (A.2), because every solution with ${\sum }_{i}{\lambda }_{i}=0$ can be replaced by a new solution with ${\sum }_{i}{\lambda }_{i}^{\prime }=\epsilon$ , by substituting ${\lambda }_{1}$ with ${\lambda }_{1}+\epsilon$ , $\epsilon \gt 0$ . Since $\epsilon$ can be arbitrarily small, this substitution does not change the value of the infimum. If A is positive, then one has the lower bound ${\omega }_{\mathrm{primal}}\geqslant \langle A,{X}_{0}\rangle \geqslant 0$ . Equation (A.3) then implies that every λ satisfying $\lambda {\rm{\Gamma }}\geqslant A$ , ${\rm{\Gamma }}\in \bar{{\mathsf{S}}}$ must be non-negative. If λ is strictly positive, the operator Γ must be positive. If $\lambda =0$ , the operator Γ can be chosen to be positive without loss of generality. In conclusion, the infimum in equation (A.3) can be restricted to ${\bar{{\mathsf{S}}}}_{+}$ . Setting $w:= 1/\lambda$ one finally obtains the desired expression.□

Appendix B.: Proof of theorem 2

Proof. The maximum performance is given by equation (21). The expression can be re-written as

$\begin{eqnarray}&&{\omega }_{\max }:= \max \{\langle {\rm{\Omega }},C\rangle \,| \,C\in {\mathsf{S}},C\geqslant 0\},\end{eqnarray} \tag{ B.1 }$

where ${\mathsf{S}}$ is the affine space of all the operators on ${\bigotimes }_{j=1}^{N}({{ \mathcal H }}_{j}^{\mathrm{out}}\otimes {{ \mathcal H }}_{j}^{\mathrm{in}})$ that are Hermitian and satisfy the linear constraint (12). Note that ${\mathsf{S}}$ contains the strictly positive operator

$\begin{eqnarray}&&{C}_{0}={I}_{0}\otimes {I}_{1}\otimes \cdots \,\otimes \,{I}_{2N-1}/({d}_{1}{d}_{3}...{d}_{2N-1})\end{eqnarray} \tag{ B.2 }$

and the dual affine space $\bar{{\mathsf{S}}}$ contains the strictly positive operator

$\begin{eqnarray}&&{{\rm{\Gamma }}}_{0}={I}_{0}\otimes {I}_{1}\otimes \cdots \,\otimes \,{I}_{2N-1}/({d}_{0}{d}_{2}...{d}_{2N-2}).\end{eqnarray} \tag{ B.3 }$

Since the sets ${\mathsf{S}}$ and $\bar{{\mathsf{S}}}$ contain strictly positive operators, the expression in theorem 1 holds with equality. Moreover, one can choose the performance operator Ω to be positive without loss of generality: if Ω is not positive, one can define ${\rm{\Omega }}^{\prime} ={\rm{\Omega }}+c{{\rm{\Gamma }}}_{0}$ , where c is a positive constant and ${{\rm{\Gamma }}}_{0}$ is the operator in equation (B.3). This substitution only shifts the primal and dual values by the constant c, while preserving the optimal solutions. For the shifted problem, theorem 1 guarantees that the dual optimization can be restricted to the positive operators in ${\bar{{\mathsf{S}}}}_{+}$ , namely

$\begin{eqnarray*}&&{\omega }_{\max }^{\prime }=\underset{{\rm{\Gamma }}\in {\bar{{\mathsf{S}}}}_{+}}{\inf }\min \{\lambda \in {\mathbb{R}}\,| \,\lambda {\rm{\Gamma }}\geqslant {\rm{\Omega }}^{\prime} \}.\end{eqnarray*}$

Now, the set ${\bar{{\mathsf{S}}}}_{+}$ has been characterized in [18]: precisely, ${\bar{{\mathsf{S}}}}_{+}$ is the set of all positive operators Γ satisfying the linear constraint

$\begin{eqnarray}{\rm{\Gamma }} & = & {I}_{{A}_{N}^{\mathrm{out}}}\otimes {{\rm{\Gamma }}}^{(N)}\\ {\Tr }_{{A}_{n}^{\mathrm{in}}}[{{\rm{\Gamma }}}^{(n)}] & = & {I}_{{A}_{n-1}^{\mathrm{out}}}\otimes {{\rm{\Gamma }}}^{(n-1)},\quad n=2,\ldots ,N\\ {\Tr }_{{A}_{1}^{\mathrm{in}}}[{{\rm{\Gamma }}}^{(1)}] & = & 1,\end{eqnarray} \tag{ B.4 }$

for suitable positive operators ${{\rm{\Gamma }}}^{(n)}$ acting on ${{ \mathcal H }}_{n}^{\mathrm{in}}\otimes [{\bigotimes }_{j=1}^{n-1}(\,{{ \mathcal H }}_{j}^{\mathrm{out}}\otimes {{ \mathcal H }}_{j}^{\mathrm{in}})]$ . Observing that ${I}_{{A}_{N}^{\mathrm{out}}}$ is the Choi operator of the trace channel ${\Tr }_{{A}_{N}^{\mathrm{out}}}$ and comparing equation (B.4) with equation (12) we then obtain that every operator Γ in ${\bar{{\mathsf{S}}}}_{+}$ is the Choi operator of a network of the form (32). Hence, ${\bar{{\mathsf{S}}}}_{+}={\mathsf{DualComb}}$ . Finally, note that the set ${\mathsf{DualComb}}$ is compact and therefore the infimum is a minimum. □

Appendix C.: Proof of proposition 5

Proof. By definition, the max relative entropies are given by ${D}_{\max }({C}_{0}\parallel {C}_{1})=-\mathrm{log}\,\max \,{\mathsf{W}}$ and ${D}_{\max }(\sqrt{{\rm{\Gamma }}}{C}_{0}\sqrt{{\rm{\Gamma }}}\,\parallel \,\sqrt{{\rm{\Gamma }}}{C}_{1}\sqrt{{\rm{\Gamma }}})=-\mathrm{log}\,\max \,{\mathsf{W}}({\rm{\Gamma }})$ , with

$\begin{eqnarray*}\begin{array}{rcl}{\mathsf{W}} & := & \{w\in {\mathbb{R}}\,| \,w{C}_{0}\leqslant {C}_{1}\},\\ {\mathsf{W}}({\rm{\Gamma }}) & := & \{w\in {\mathbb{R}}\,| \,w\sqrt{{\rm{\Gamma }}}{C}_{0}\sqrt{{\rm{\Gamma }}}\leqslant \sqrt{{\rm{\Gamma }}}{C}_{1}\sqrt{{\rm{\Gamma }}}\}.\end{array}\end{eqnarray*}$

By construction, one has ${\mathsf{W}}\subseteq {\mathsf{W}}({\rm{\Gamma }})$ for every Γ, and therefore

$\begin{eqnarray*}&&{D}_{\max }({C}_{0}\,\parallel \,{C}_{1})\geqslant {D}_{\max }(\sqrt{{\rm{\Gamma }}}{C}_{0}\sqrt{{\rm{\Gamma }}}\,\parallel \,\sqrt{{\rm{\Gamma }}}{C}_{1}\sqrt{{\rm{\Gamma }}}).\end{eqnarray*}$

On the other hand, if ${\bar{{\mathsf{S}}}}_{+}$ contains a full-rank element ${{\rm{\Gamma }}}_{* }$ , then ${\mathsf{W}}({{\rm{\Gamma }}}_{* })={\mathsf{W}}$ . □
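The last step of the argument can be illustrated numerically: conjugation by an invertible $\sqrt{{\rm{\Gamma }}}$ preserves operator inequalities, so the max relative entropy is unchanged. The sketch below is an illustration only. It uses random strictly positive matrices in place of actual Choi operators, it does not impose the affine constraints defining $\bar{{\mathsf{S}}}$ (a rescaling of Γ does not affect the result), and the helper names are hypothetical.

```python
import numpy as np


def random_positive(n, rng, shift=0.5):
    """Random strictly positive n x n matrix (eigenvalues >= shift)."""
    x = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return x @ x.conj().T + shift * np.eye(n)


def psd_power(A, p):
    """Power A**p of a strictly positive matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(A)
    return (v * w**p) @ v.conj().T


def d_max(C0, C1):
    """-log2 max{w : w C0 <= C1}, assuming C1 is strictly positive."""
    S = psd_power(C1, -0.5)
    # Smallest lam such that C0 <= lam * C1; the maximum w is 1 / lam.
    lam = np.linalg.eigvalsh(S @ C0 @ S).max()
    return np.log2(lam)


rng = np.random.default_rng(1)
C0, C1, G = (random_positive(4, rng) for _ in range(3))
sqrt_G = psd_power(G, 0.5)

d_plain = d_max(C0, C1)
d_sandwich = d_max(sqrt_G @ C0 @ sqrt_G, sqrt_G @ C1 @ sqrt_G)
print(d_plain, d_sandwich)
```

For a full-rank Γ the two printed values agree up to numerical precision, in line with ${\mathsf{W}}({{\rm{\Gamma }}}_{* })={\mathsf{W}}$ .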

Appendix D.: Proof of proposition 6

Proof. Let us compute the conditional min-entropy of the output state

$\begin{eqnarray*}&&\rho =\sigma \ast {D}_{1}\ast {E}_{1}\ast {D}_{2}\ast {E}_{2}\ast \cdots \ast {E}_{N-1}\ast {D}_{N}\in {\mathsf{St}}({B}_{N}^{\mathrm{out}}\otimes {B}_{N}^{\mathrm{out}^{\prime} })\end{eqnarray*}$

(see equation (42)). By the operational characterization of the conditional min-entropy (equation (5)), we have

$\begin{eqnarray}&&{H}_{\min }{({B}_{N}^{\mathrm{out}}| {B}_{N}^{\mathrm{out}^{\prime} })}_{\rho }=\underset{\begin{array}{c}C\geqslant 0,\\ {\Tr }_{{B}_{N}^{\mathrm{out}}}[C]={I}_{{B}_{N}^{\mathrm{out}^{\prime} }}\end{array}}{\max }\displaystyle \frac{\Tr [\,\rho \,{C}^{T}\,]}{{d}_{{B}_{N}^{\mathrm{out}}}},\end{eqnarray} \tag{ D.1 }$

where C is the Choi operator of a recovery channel ${ \mathcal C }$ , which attempts to turn ρ into the maximally entangled state $| {\rm{\Phi }}\rangle$ . Substituting the expression for ρ and maximizing over the sequence $(\sigma ,{E}_{1},...,{E}_{N-1})$ , we then obtain

$\begin{eqnarray}&&\underset{\sigma ,{E}_{1},{E}_{2},...,{E}_{N-1}}{\max }\,{H}_{\min }{({B}_{N}^{\mathrm{out}}| {B}_{N}^{\mathrm{out}^{\prime} })}_{\rho }\\ &&\quad =\underset{\sigma ,{E}_{1},{E}_{2},...,{E}_{N-1},C}{\max }\,\displaystyle \frac{\Tr [(\sigma \ast {D}_{1}\ast {E}_{1}\ast {D}_{2}\ast {E}_{2}\ast ...\ast {E}_{N-1}\ast {D}_{N})\,{C}^{T}\,]}{{d}_{{B}_{N}^{\mathrm{out}}}}\\ &&\quad =\underset{\sigma ,{E}_{1},{E}_{2},...,{E}_{N-1},C}{\max }\,\displaystyle \frac{(\sigma \ast {D}_{1}\ast {E}_{1}\ast {D}_{2}\ast {E}_{2}\ast ...\ast {E}_{N-1}\ast {D}_{N})\ast C}{{d}_{{B}_{N}^{\mathrm{out}}}}\end{eqnarray} \tag{ D.2 }$

$\begin{eqnarray}&&=\underset{\sigma ,{E}_{1},{E}_{2},...,{E}_{N-1},C}{\max }\,\displaystyle \frac{E^{\prime} \ast R}{{d}_{{B}_{N}^{\mathrm{out}}}},\end{eqnarray} \tag{ D.3 }$

having defined

$\begin{eqnarray*}&&E^{\prime} =\sigma \ast {E}_{1}\ast \cdots \ast {E}_{N-1}\ast C\qquad \mathrm{and}\qquad R={D}_{1}\ast \cdots \ast {D}_{N}.\end{eqnarray*}$

Now, note that $E^{\prime}$ is the Choi operator of a network of the form of equation (41). Moreover, since the channel ${ \mathcal C }$ can be chosen to be the identity, $E^{\prime}$ is the Choi operator of an arbitrary network of the form of equation (41). Using equations (43) and (44) we finally obtain

$\begin{eqnarray*}&&\underset{\sigma ,{E}_{1},{E}_{2},...,{E}_{N-1}}{\max }\,{H}_{\min }{({B}_{N}^{\mathrm{out}}| {B}_{N}^{\mathrm{out}^{\prime} })}_{\rho }={H}_{\min }{({t}_{N}| {t}_{1}...{t}_{N-1})}_{R}\,.\end{eqnarray*}$

□

Appendix E.: Proof of proposition 7

Proof. The proof is based on proposition 5. Take an operator ${\rm{\Gamma }}\in {\mathsf{DualComb}}({{ \mathcal H }}_{{A}_{1}}^{\mathrm{in}},{{ \mathcal H }}_{{A}_{1}}^{\mathrm{out}},...,{{ \mathcal H }}_{{A}_{N}}^{\mathrm{in}},{{ \mathcal H }}_{{A}_{N}}^{\mathrm{out}})$ and diagonalize it as ${\rm{\Gamma }}={\sum }_{i}\,{g}_{i}\,| {\phi }_{i}\rangle \langle {\phi }_{i}|$ . Choose S to be the composite system ${A}_{1}^{\mathrm{in}}{A}_{1}^{\mathrm{out}}\cdots {A}_{N}^{\mathrm{in}}{A}_{N}^{\mathrm{out}}$ and define the vector $| {\rm{\Psi }}\rangle ={\sum }_{i}\,\sqrt{{g}_{i}}\,| {\phi }_{i}\rangle | {\bar{\phi }}_{i}\rangle \in {{ \mathcal H }}_{{A}_{1}^{\mathrm{in}}}\otimes {{ \mathcal H }}_{{A}_{1}^{\mathrm{out}}}\otimes \cdots \otimes {{ \mathcal H }}_{{A}_{N}^{\mathrm{in}}}\otimes \,{{ \mathcal H }}_{{A}_{N}^{\mathrm{out}}}\otimes {{ \mathcal H }}_{S}$ . Then, the positive operator $E=| {\rm{\Psi }}\rangle \langle {\rm{\Psi }}|$ is the Choi operator of a network of the form of equation (50), as one can check from equation (12). An explicit calculation then gives

$\begin{eqnarray*}&&{C}^{(x)}\ast E=\sqrt{{\rm{\Gamma }}}{C}^{(x)}\,\sqrt{{\rm{\Gamma }}}.\end{eqnarray*}$

Using proposition 5 we then conclude the equality

$\begin{eqnarray*}&&{D}_{\max }({C}^{(0)}\,\parallel \,{C}^{(1)})=\underset{{\rm{\Gamma }}}{\max }\,{D}_{\max }({C}^{(0)}\ast E\,\parallel \,{C}^{(1)}\ast E).\end{eqnarray*}$

□

Appendix F.: Normalization condition for supermaps on product channels

Equation (56) gives us the Choi operator C. In order for C to be the Choi operator of a channel, we must have

$\begin{eqnarray}&&{\Tr }_{{A}_{2}^{\mathrm{out}},{B}_{2}^{\mathrm{out}}}[C]={I}_{{A}_{2}^{\mathrm{in}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}}.\end{eqnarray} \tag{ F.1 }$

Inserting equation (56), we then obtain the condition

$\begin{eqnarray}&&{\Tr }_{{A}_{2}^{\mathrm{out}},{B}_{2}^{\mathrm{out}},{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[({I}_{{A}_{2}^{\mathrm{out}}}\otimes {I}_{{A}_{2}^{\mathrm{in}}}\otimes N\otimes {I}_{{B}_{2}^{\mathrm{out}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}})(A\otimes B)]={I}_{{A}_{2}^{\mathrm{in}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}},\end{eqnarray} \tag{ F.2 }$

which must be satisfied whenever A and B satisfy the conditions

$\begin{eqnarray}&&{\Tr }_{{A}_{1}^{\mathrm{out}},{A}_{2}^{\mathrm{out}}}[A]={I}_{{A}_{1}^{\mathrm{in}}}\otimes {I}_{{A}_{2}^{\mathrm{in}}}\qquad \mathrm{and}\qquad {\Tr }_{{B}_{1}^{\mathrm{out}},{B}_{2}^{\mathrm{out}}}[B]={I}_{{B}_{1}^{\mathrm{in}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}}.\end{eqnarray} \tag{ F.3 }$

Now, we have the following

Proposition 11. For every operator $N$ , the following conditions are equivalent:

(i) $N$ satisfies the condition (F.2) for all operators $A$ and $B$ satisfying the condition (F.3);
(ii) $N$ satisfies the condition
$\begin{eqnarray}&&{\Tr }_{{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[N(\tilde{A}\otimes \tilde{B})]=1\end{eqnarray} \tag{ F.4 }$
for all operators $\tilde{A}\in {\mathsf{Lin}}({{ \mathcal H }}_{{A}_{1}^{\mathrm{out}}}\otimes {{ \mathcal H }}_{{A}_{1}^{\mathrm{in}}})$ and $\tilde{B}\in {\mathsf{Lin}}({{ \mathcal H }}_{{B}_{1}^{\mathrm{out}}}\otimes {{ \mathcal H }}_{{B}_{1}^{\mathrm{in}}})$ satisfying the conditions
$\begin{eqnarray}&&{\Tr }_{{A}_{1}^{\mathrm{out}}}[\tilde{A}]={I}_{{A}_{1}^{\mathrm{in}}}\qquad \mathrm{and}\qquad {\Tr }_{{B}_{1}^{\mathrm{out}}}[\tilde{B}]={I}_{{B}_{1}^{\mathrm{in}}}.\end{eqnarray} \tag{ F.5 }$

Proof. Suppose that the operators $\tilde{A}$ and $\tilde{B}$ satisfy the trace conditions (F.5). By defining the operators A and B as $A=\tilde{A}\otimes {I}_{{A}_{2}^{\mathrm{in}},{A}_{2}^{\mathrm{out}}}/{d}_{{A}_{2}^{\mathrm{out}}}$ and $B=\tilde{B}\otimes {I}_{{B}_{2}^{\mathrm{in}},{B}_{2}^{\mathrm{out}}}/{d}_{{B}_{2}^{\mathrm{out}}}$ , we see that equation (F.3) is satisfied, because the partial trace of the identity over the output system ${A}_{2}^{\mathrm{out}}$ (respectively ${B}_{2}^{\mathrm{out}}$ ) contributes a factor equal to its dimension. Then, equation (F.2) becomes

$\begin{eqnarray} & & {\Tr }_{{A}_{2}^{\mathrm{out}},{B}_{2}^{\mathrm{out}},{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[({I}_{{A}_{2}^{\mathrm{out}}}\otimes {I}_{{A}_{2}^{\mathrm{in}}}\otimes N\otimes {I}_{{B}_{2}^{\mathrm{out}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}})(A\otimes B)]\\ & = & {\Tr }_{{A}_{2}^{\mathrm{out}},{B}_{2}^{\mathrm{out}},{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}\left\{\displaystyle \frac{{I}_{{A}_{2}^{\mathrm{out}}}}{{d}_{{A}_{2}^{\mathrm{out}}}}\otimes {I}_{{A}_{2}^{\mathrm{in}}}\otimes [N(\tilde{A}\otimes \tilde{B})]\otimes \displaystyle \frac{{I}_{{B}_{2}^{\mathrm{out}}}}{{d}_{{B}_{2}^{\mathrm{out}}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}}\right\}={I}_{{A}_{2}^{\mathrm{in}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}}.\end{eqnarray} \tag{ F.6 }$

The above equation holds if and only if condition (F.4) is satisfied. Conversely, suppose that the operator N satisfies condition (F.4) for all operators $\tilde{A}$ and $\tilde{B}$ satisfying the trace conditions (F.5). For operators A and B satisfying the conditions (F.3), we obtain

$\begin{eqnarray}\begin{array}{rcl} & & {\Tr }_{{A}_{2}^{\mathrm{out}},{B}_{2}^{\mathrm{out}},{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[({I}_{{A}_{2}^{\mathrm{out}}}\otimes {I}_{{A}_{2}^{\mathrm{in}}}\otimes N\otimes {I}_{{B}_{2}^{\mathrm{out}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}})(A\otimes B)]\\ & = & {\Tr }_{{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[({I}_{{A}_{2}^{\mathrm{in}}}\otimes N\otimes {I}_{{B}_{2}^{\mathrm{in}}})({\Tr }_{{A}_{2}^{\mathrm{out}},{B}_{2}^{\mathrm{out}}}[A\otimes B])]\\ & = & {\Tr }_{{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[({I}_{{A}_{2}^{\mathrm{in}}}\otimes N\otimes {I}_{{B}_{2}^{\mathrm{in}}})(\bar{A}\otimes \bar{B})],\end{array}\end{eqnarray} \tag{ F.7 }$

where we defined ${\Tr }_{{A}_{2}^{\mathrm{out}}}[A]=\bar{A}$ and ${\Tr }_{{B}_{2}^{\mathrm{out}}}[B]=\bar{B}$ . Hence, equation (F.2) holds if and only if

$\begin{eqnarray}&&{\Tr }_{{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[({I}_{{A}_{2}^{\mathrm{in}}}\otimes N\otimes {I}_{{B}_{2}^{\mathrm{in}}})(\bar{A}\otimes \bar{B})]={I}_{{A}_{2}^{\mathrm{in}}}\otimes {I}_{{B}_{2}^{\mathrm{in}}}.\end{eqnarray} \tag{ F.8 }$

In turn, the above equation holds if and only if

$\begin{eqnarray}&&{\Tr }_{{A}_{1}^{\mathrm{in}},{A}_{1}^{\mathrm{out}},{B}_{1}^{\mathrm{in}},{B}_{1}^{\mathrm{out}}}[N({\bar{A}}_{\rho }\otimes {\bar{B}}_{\sigma })]=1,\qquad \forall \rho \in {\mathsf{St}}({{ \mathcal H }}_{{A}_{2}^{\mathrm{in}}}),\forall \sigma \in {\mathsf{St}}({{ \mathcal H }}_{{B}_{2}^{\mathrm{in}}}),\end{eqnarray} \tag{ F.9 }$

where ${\bar{A}}_{\rho }$ and ${\bar{B}}_{\sigma }$ are defined as

$\begin{eqnarray}&&{\bar{A}}_{\rho }:= {\Tr }_{{A}_{2}^{\mathrm{in}}}[(\rho \otimes {I}_{{A}_{1}^{\mathrm{out}}{A}_{1}^{\mathrm{in}}})\bar{A}]\qquad \mathrm{and}\qquad {\bar{B}}_{\sigma }:= {\Tr }_{{B}_{2}^{\mathrm{in}}}[(\sigma \otimes {I}_{{B}_{1}^{\mathrm{out}}{B}_{1}^{\mathrm{in}}})\bar{B}].\end{eqnarray} \tag{ F.10 }$

Now, the normalization condition (F.9) is nothing but equation (F.4). The condition is satisfied because the operators ${\bar{A}}_{\rho }$ and ${\bar{B}}_{\sigma }$ satisfy condition (F.5). □

Appendix G.: Maximum fidelity for the inversion of an unknown dynamics

The performance operator Ω reads

$\begin{eqnarray}{\rm{\Omega }} & = & \displaystyle \frac{1}{{d}^{2}}\displaystyle \int {\rm{d}}U\,| {U}^{\dagger } \rangle\rangle \langle\langle {U}^{\dagger }{| }_{30}\otimes | \bar{U} \rangle\rangle \langle\langle \bar{U}{| }_{21}\\ & = & \displaystyle \frac{1}{{d}^{2}}\displaystyle \int {\rm{d}}U\,({I}_{3}\otimes {\bar{U}}_{0}\otimes {\bar{U}}_{2}\otimes {I}_{1})(\,| I \rangle\rangle \langle\langle I{| }_{30}\otimes | I \rangle\rangle \langle\langle I{| }_{21})\,{({I}_{3}\otimes {\bar{U}}_{0}\otimes {\bar{U}}_{2}\otimes {I}_{1})}^{\dagger }.\end{eqnarray} \tag{ G.1 }$

Explicit calculation using Schur's lemma yields the relations

$\begin{eqnarray}&&[{\rm{\Omega }},{I}_{3}\otimes {U}_{2}\otimes {I}_{1}\otimes {U}_{0}]=0,\end{eqnarray} \tag{ G.2 }$

$\begin{eqnarray}&&[{\rm{\Omega }},{U}_{3}\otimes {I}_{2}\otimes {U}_{1}\otimes {I}_{0}]=0,\end{eqnarray} \tag{ G.3 }$

valid for every unitary U. Explicitly, the operator Ω is given by

$\begin{eqnarray}&&{\rm{\Omega }}=\displaystyle \frac{1}{{d}^{2}}\left(\displaystyle \frac{{P}_{+,31}\otimes {P}_{+,20}}{{d}_{+}}+\displaystyle \frac{{P}_{-,31}\otimes {P}_{-,20}}{{d}_{-}}\right),\end{eqnarray} \tag{ G.4 }$

where ${P}_{+}$ and ${P}_{-}$ are the projectors on the symmetric and antisymmetric subspaces, respectively, and ${d}_{\pm }=d(d\pm 1)/2$ are the dimensions of these subspaces.

The problem is to find the minimum λ such that $\lambda {\rm{\Gamma }}\geqslant {\rm{\Omega }}$ , for Γ satisfying the conditions (76). The first condition requires Γ to be of the form ${\rm{\Gamma }}={I}_{3}\otimes {T}_{210}$ . Now, equation (G.3) implies that, without loss of generality, the operator ${T}_{210}$ can be chosen to satisfy the condition

$\begin{eqnarray}&&[{T}_{210},{I}_{2}\otimes {U}_{1}\otimes {I}_{0}]=0\qquad \forall \ U\in {\mathsf{SU}}(d)\end{eqnarray} \tag{ G.5 }$

which in turn implies

$\begin{eqnarray}&&{T}_{210}={Q}_{20}\otimes {I}_{1},\end{eqnarray} \tag{ G.6 }$

where ${Q}_{20}$ is some positive operator on ${{ \mathcal H }}_{20}$ . Similarly, equation (G.2) implies that we can choose ${T}_{210}$ to satisfy the condition

$\begin{eqnarray}&&[{T}_{210},{U}_{2}\otimes {I}_{1}\otimes {U}_{0}]=0,\qquad \forall \ U\in {\mathsf{SU}}(d).\end{eqnarray} \tag{ G.7 }$

Combined with equation (G.6), the above relation implies

$\begin{eqnarray}&&[{Q}_{20},{U}_{2}\otimes {U}_{0}]=0\qquad \forall \ U\in {\mathsf{SU}}(d)\end{eqnarray} \tag{ G.8 }$

and therefore

$\begin{eqnarray}&&{Q}_{20}=\alpha {P}_{+}+\beta {P}_{-}.\end{eqnarray} \tag{ G.9 }$

Finally, the last condition in equation (76) gives $\Tr [{Q}_{20}]=1$ and, therefore,

$\begin{eqnarray*}&&\alpha {d}_{+}+\beta {d}_{-}=1.\end{eqnarray*}$

The dual constraint $\lambda \,{\rm{\Gamma }}\geqslant {\rm{\Omega }}$ then reads

$\begin{eqnarray}&&\lambda [\,\alpha ({I}_{31}\otimes {P}_{+,20})+\beta ({I}_{31}\otimes {P}_{-,20})\,]\geqslant \displaystyle \frac{1}{{d}^{2}}\left(\displaystyle \frac{{P}_{+,31}\otimes {P}_{+,20}}{{d}_{+}}+\displaystyle \frac{{P}_{-,31}\otimes {P}_{-,20}}{{d}_{-}}\right).\end{eqnarray} \tag{ G.10 }$

Pinching both sides with the projectors ${P}_{+,31}\otimes {P}_{+,20}$ and ${P}_{-,31}\otimes {P}_{-,20}$ , and using $\beta {d}_{-}=1-\alpha {d}_{+}$ , one obtains

$\begin{eqnarray}&&\lambda \geqslant \displaystyle \frac{1}{{d}_{+}{d}^{2}\,\alpha }\qquad \mathrm{and}\qquad \lambda \geqslant \displaystyle \frac{1}{{d}^{2}\,(1-\alpha {d}_{+})}\mathrm{.}\end{eqnarray} \tag{ G.11 }$

By separately considering the cases ${d}_{+}\alpha \geqslant 1-{d}_{+}\alpha $ and ${d}_{+}\alpha \lt 1-{d}_{+}\alpha $ , we find that the minimum λ is ${\lambda }_{\min }=2/{d}^{2}$ , attained for $\alpha {d}_{+}=\beta {d}_{-}=1/2$ .
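The value ${\lambda }_{\min }=2/{d}^{2}$ can be cross-checked by testing the operator inequality $\lambda {\rm{\Gamma }}\geqslant {\rm{\Omega }}$ directly. The following sketch assumes the normalization $\Tr [{Q}_{20}]=1$ and the symmetric choice $\alpha {d}_{+}=\beta {d}_{-}=1/2$; the function names are illustrative. It only confirms that this particular Γ is feasible and tight at $\lambda =2/{d}^{2}$, while optimality over all Γ relies on the symmetry argument in the text.

```python
import numpy as np

def swap_op(d, a, b):
    """Swap tensor factors a and b among four d-dimensional systems."""
    axes = list(range(8))
    axes[a], axes[b] = axes[b], axes[a]
    return np.eye(d**4).reshape([d] * 8).transpose(axes).reshape(d**4, d**4)

def dual_gap(d, lam):
    """Smallest eigenvalue of lam*Gamma - Omega, and Tr[Q_20]."""
    one = np.eye(d**4)
    s31, s20 = swap_op(d, 0, 2), swap_op(d, 1, 3)   # factor order (3, 2, 1, 0)
    d_plus, d_minus = d * (d + 1) / 2, d * (d - 1) / 2
    omega = ((one + s31) @ (one + s20) / d_plus
             + (one - s31) @ (one - s20) / d_minus) / (4 * d**2)
    alpha, beta = 1 / (2 * d_plus), 1 / (2 * d_minus)
    # Gamma = I_31 ⊗ Q_20 with Q_20 = alpha P_+ + beta P_-
    gamma = alpha * (one + s20) / 2 + beta * (one - s20) / 2
    trace_q = np.trace(gamma) / d**2
    return np.linalg.eigvalsh(lam * gamma - omega).min(), trace_q

d = 3
gap_tight, trace_q = dual_gap(d, 2 / d**2)          # constraint holds with equality
gap_below, _ = dual_gap(d, 0.99 * 2 / d**2)         # slightly smaller lambda fails
print(gap_tight, trace_q, gap_below)
```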

Appendix H. Maximum fidelity for the charge conjugation of an unknown unitary evolution

The maximization of the fidelity proceeds in the same way as for gate inversion. The only difference is that now the performance operator Ω is given by equation (80), namely

$\begin{eqnarray}&&{\rm{\Omega }}=\displaystyle \frac{1}{{d}^{2}}\left(\displaystyle \frac{{P}_{+,32}\otimes {P}_{+,10}}{{d}_{+}}+\displaystyle \frac{{P}_{-,32}\otimes {P}_{-,10}}{{d}_{-}}\right).\end{eqnarray} \tag{ H.1 }$

The form of Ω implies the relations

$\begin{eqnarray}&&[{\rm{\Omega }},{U}_{3}\otimes {U}_{2}\otimes {I}_{10}]=0,\end{eqnarray} \tag{ H.2 }$

$\begin{eqnarray}&&[{\rm{\Omega }},{I}_{32}\otimes {U}_{1}\otimes {U}_{0}]=0,\end{eqnarray} \tag{ H.3 }$

valid for every U in ${\mathsf{SU}}(d)$ . Now, one has to find the minimum λ such that $\lambda \,{\rm{\Gamma }}\geqslant {\rm{\Omega }}$ , with ${\rm{\Gamma }}={I}_{3}\otimes {T}_{210}$ satisfying equation (76). Equation (H.2) implies that, without loss of generality, one has

$\begin{eqnarray}&&[{T}_{210},{U}_{2}\otimes {I}_{10}]=0\qquad \forall U\in {\mathsf{SU}}(d),\end{eqnarray} \tag{ H.4 }$

and therefore ${T}_{210}={I}_{2}\otimes {Q}_{10}$ . Moreover, the second condition in equation (76) reads

$\begin{eqnarray*}&&{\Tr }_{2}[{T}_{210}]={I}_{1}\otimes {\rho }_{0}\end{eqnarray*}$

and implies that ${Q}_{10}$ has the form ${Q}_{10}={I}_{1}\otimes {\rho }_{0}/d$ . Finally, equation (H.3) implies that one can choose ${\rho }_{0}=I/d$ without loss of generality. Summing everything up, Γ can be chosen to be of the form ${\rm{\Gamma }}={I}_{3}\otimes {T}_{210}={I}_{3210}/{d}^{2}$ . The dual constraint $\lambda {\rm{\Gamma }}\geqslant {\rm{\Omega }}$ then becomes

$\begin{eqnarray}&&\lambda \,\displaystyle \frac{{I}_{3210}}{{d}^{2}}\geqslant \displaystyle \frac{1}{{d}^{2}}\left(\displaystyle \frac{{P}_{+,32}\otimes {P}_{+,10}}{{d}_{+}}+\displaystyle \frac{{P}_{-,32}\otimes {P}_{-,10}}{{d}_{-}}\right)\end{eqnarray} \tag{ H.5 }$

yielding the minimal value ${\lambda }_{\min }=1/{d}_{-}=2/[d(d-1)]$ , since the largest eigenvalue of the right-hand side is $1/({d}^{2}{d}_{-})$ .
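Since Γ is proportional to the identity here, the bound reduces to an eigenvalue computation, which is easy to reproduce numerically. A minimal sketch, with the tensor factors ordered as (3, 2, 1, 0) and with hypothetical helper names:

```python
import numpy as np

def sym_projectors(d):
    """Projectors on the symmetric and antisymmetric subspaces of two qudits."""
    swap = np.eye(d * d).reshape(d, d, d, d).transpose(1, 0, 2, 3).reshape(d * d, d * d)
    one = np.eye(d * d)
    return (one + swap) / 2, (one - swap) / 2

def lambda_min_conjugation(d):
    """Smallest lambda with lambda * I/d^2 >= Omega, for Omega as in (H.1)."""
    p_plus, p_minus = sym_projectors(d)
    d_plus, d_minus = d * (d + 1) / 2, d * (d - 1) / 2
    omega = (np.kron(p_plus, p_plus) / d_plus
             + np.kron(p_minus, p_minus) / d_minus) / d**2
    # lambda * I/d^2 >= Omega  iff  lambda >= d^2 * (largest eigenvalue of Omega)
    return d**2 * np.linalg.eigvalsh(omega).max()

for d in (2, 3, 4):
    print(d, lambda_min_conjugation(d), 2 / (d * (d - 1)))
```

For d = 2 the bound equals 1, consistent with the value $2/[d(d-1)]$ stated above.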

Appendix I. Maximum fidelity for unitary controlization

The performance operator for the controlization problem is

$\begin{eqnarray}{\rm{\Omega }} & = & \displaystyle \frac{1}{4{d}^{2}}\displaystyle \int {\rm{d}}U\,| {\mathtt{ctrl}}-U \rangle\rangle \langle\langle {\mathtt{ctrl}}-U{| }_{30Q^{\prime} Q}\otimes | \bar{U} \rangle\rangle \langle\langle \bar{U}{| }_{21}\\ & = & {{\rm{\Omega }}}_{3210}^{(0)}\otimes | 0\rangle \langle 0{| }_{Q^{\prime} }\otimes | 0\rangle \langle 0{| }_{Q}+{{\rm{\Omega }}}_{3210}^{(1)}\otimes | 1\rangle \langle 1{| }_{Q^{\prime} }\otimes | 1\rangle \langle 1{| }_{Q},\end{eqnarray} \tag{ I.1 }$

where Q and $Q^{\prime}$ denote the control qubit before and after the interaction, respectively, and

$\begin{eqnarray}&&{{\rm{\Omega }}}_{3210}^{(0)}:= \displaystyle \frac{1}{4{d}^{2}}({E}_{30}\otimes {I}_{21}),\end{eqnarray} \tag{ I.2 }$

$\begin{eqnarray}&&{{\rm{\Omega }}}_{3210}^{(1)}:= \displaystyle \frac{1}{4{d}^{2}}\left({E}_{32}\otimes {E}_{10}+\displaystyle \frac{{E}_{32}^{\perp }\otimes {E}_{10}^{\perp }}{{d}_{\perp }}\right).\end{eqnarray} \tag{ I.3 }$

Here E denotes the projector on the maximally entangled state $| {{\rm{\Phi }}}^{+}\rangle =| I \rangle\rangle /\sqrt{d}$ , ${E}^{\perp }$ is the orthogonal projector ${E}^{\perp }:= {I}^{\otimes 2}-E$ , and ${d}_{\perp }:= {d}^{2}-1$ . Note that the operators ${{\rm{\Omega }}}_{3210}^{(0)}$ and ${{\rm{\Omega }}}_{3210}^{(1)}$ satisfy the conditions

$\begin{eqnarray}&&[{{\rm{\Omega }}}_{3210}^{(0)},{U}_{3}\otimes {I}_{21}\otimes {\bar{U}}_{0}]=0,\end{eqnarray} \tag{ I.4 }$

$\begin{eqnarray}&&[{{\rm{\Omega }}}_{3210}^{(1)},{U}_{3}\otimes {\bar{U}}_{2}\otimes {I}_{10}]=0,\end{eqnarray} \tag{ I.5 }$

$\begin{eqnarray}&&[{{\rm{\Omega }}}_{3210}^{(1)},{I}_{32}\otimes {\bar{U}}_{1}\otimes {U}_{0}]=0,\end{eqnarray} \tag{ I.6 }$

for every group element $U\in {\mathsf{SU}}(d)$ .

To solve the dual problem, we have to find the minimum λ satisfying the relation $\lambda \,{\rm{\Gamma }}\geqslant {\rm{\Omega }}$ for some dual comb Γ. By equation (76), we have ${\rm{\Gamma }}={I}_{b}\otimes {I}_{3}\otimes {T}_{210a}$ , where the labels a and b refer to the control systems Q and $Q^{\prime}$ , respectively, and ${T}_{210a}$ is a suitable operator satisfying the conditions

$\begin{eqnarray*}\begin{array}{rcl}{\Tr }_{2}[{T}_{210a}] & = & {I}_{1}\otimes {\rho }_{0a},\\ {\Tr }_{0a}[{\rho }_{0a}] & = & 1.\end{array}\end{eqnarray*}$

Without loss of generality, ${T}_{210a}$ can be chosen of the form

$\begin{eqnarray}&&{T}_{210a}={T}_{210}^{(0)}\otimes | 0\rangle \langle 0{| }_{Q}+{T}_{210}^{(1)}\otimes | 1\rangle \langle 1{| }_{Q},\end{eqnarray} \tag{ I.7 }$

with the operators ${T}_{210}^{(0)}$ and ${T}_{210}^{(1)}$ satisfying the conditions

$\begin{eqnarray}&&{\Tr }_{2}[{T}_{210}^{(0)}]={p}_{0}[{I}_{1}\otimes {\rho }_{0}^{(0)}]\qquad \mathrm{and}\qquad {\Tr }_{2}[{T}_{210}^{(1)}]={p}_{1}[{I}_{1}\otimes {\rho }_{0}^{(1)}],\end{eqnarray} \tag{ I.8 }$

where ${\rho }_{0}^{(0)}$ and ${\rho }_{0}^{(1)}$ are two density matrices and ${p}_{0}$ and ${p}_{1}$ are probabilities with ${p}_{0}+{p}_{1}=1$ . The dual constraint is then reduced to

$\begin{eqnarray}&&\lambda [{I}_{3}\otimes {T}_{210}^{(k)}]\geqslant {{\rm{\Omega }}}_{3210}^{(k)},\qquad \forall k\in \{0,1\}.\end{eqnarray} \tag{ I.9 }$

At this point, equation (I.4) implies that, without loss of generality, one can choose ${T}_{210}^{(0)}$ to satisfy the relation

$\begin{eqnarray*}&&[{T}_{210}^{(0)},{I}_{21}\otimes {\bar{U}}_{0}]=0,\qquad \forall U\in {\mathsf{SU}}(d),\end{eqnarray*}$

which implies ${T}_{210}^{(0)}={Q}_{21}^{(0)}\otimes {I}_{0}$ for some suitable operator ${Q}_{21}^{(0)}$ . Moreover, equation (I.2) implies that, without loss of generality, one can choose ${Q}_{21}^{(0)}$ to be proportional to the identity, so that, using the normalization condition (I.8), one eventually has

$\begin{eqnarray}&&{T}_{210}^{(0)}={p}_{0}\,\displaystyle \frac{{I}_{2}\otimes {I}_{1}\otimes {I}_{0}}{{d}^{2}}.\end{eqnarray} \tag{ I.10 }$

Similarly, equations (I.5) and (I.6) imply that, without loss of generality, one can choose ${T}_{210}^{(1)}$ to satisfy the relations

$\begin{eqnarray}&&[{T}_{210}^{(1)},{\bar{U}}_{2}\otimes {I}_{10}]=0,\end{eqnarray} \tag{ I.11 }$

$\begin{eqnarray}&&[{T}_{210}^{(1)},{I}_{2}\otimes {\bar{U}}_{1}\otimes {U}_{0}]=0,\end{eqnarray} \tag{ I.12 }$

for every unitary $U\in {\mathsf{SU}}(d)$ . Now, equation (I.11) implies that ${T}_{210}^{(1)}$ has the form

$\begin{eqnarray}&&{T}_{210}^{(1)}={I}_{2}\otimes {Q}_{10}^{(1)}\end{eqnarray} \tag{ I.13 }$

and equation (I.8) implies the condition

$\begin{eqnarray*}d\,{Q}_{10}^{(1)} & = & {\Tr }_{2}[{T}_{210}^{(1)}]\\ & = & {p}_{1}[{I}_{1}\otimes {\rho }_{0}^{(1)}]\end{eqnarray*}$

for some probability ${p}_{1}$ and some quantum state ${\rho }_{0}^{(1)}$ . Combining equations (I.13) and (I.12) one finally obtains ${Q}_{10}^{(1)}={p}_{1}\,{I}_{1}\otimes {I}_{0}/{d}^{2}$ , and therefore

$\begin{eqnarray}&&{T}_{210}^{(1)}={p}_{1}\,\displaystyle \frac{{I}_{2}\otimes {I}_{1}\otimes {I}_{0}}{{d}^{2}}.\end{eqnarray} \tag{ I.14 }$

Inserting the above relations into the dual constraint, we then obtain

$\begin{eqnarray}&&\lambda \,{p}_{0}\,\displaystyle \frac{{I}_{3210}}{{d}^{2}}\geqslant \displaystyle \frac{1}{4{d}^{2}}({E}_{30}\otimes {I}_{21})\end{eqnarray} \tag{ I.15 }$

$\begin{eqnarray}&&\lambda \,{p}_{1}\,\displaystyle \frac{{I}_{3210}}{{d}^{2}}\geqslant \displaystyle \frac{1}{4{d}^{2}}\left({E}_{32}\otimes {E}_{10}+\displaystyle \frac{{E}_{32}^{\perp }\otimes {E}_{10}^{\perp }}{{d}_{\perp }}\right),\end{eqnarray} \tag{ I.16 }$

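having used equations (I.10) and (I.14). Since both right-hand sides have largest eigenvalue $1/(4{d}^{2})$ , the parameters $\lambda ,{p}_{0},$ and ${p}_{1}$ must satisfy $\lambda {p}_{0}\geqslant 1/4$ and $\lambda {p}_{1}\geqslant 1/4$ . Because ${p}_{0}+{p}_{1}=1$ , the minimum is attained for ${p}_{0}={p}_{1}=1/2$ , leading to the minimum value ${\lambda }_{\min }=1/2$ .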
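The last step can be reproduced numerically from the spectra of the two performance operators. The sketch below assumes ${p}_{0}+{p}_{1}=1$ (as follows from the normalization of ${\rho }_{0a}$) and uses the fact that reordering tensor factors does not change eigenvalues, so ${E}_{30}\otimes {I}_{21}$ is represented simply as $E\otimes I$. The function name `controlization_bound` is illustrative.

```python
import numpy as np

def controlization_bound(d):
    """Return (c0, c1, lambda_min), where lambda * p_k >= c_k is the k-th constraint."""
    eye2 = np.eye(d * d)
    phi = np.eye(d).reshape(-1) / np.sqrt(d)      # |Phi+> = |I>>/sqrt(d)
    e = np.outer(phi, phi)                        # projector E
    e_perp = eye2 - e
    d_perp = d * d - 1
    omega0 = np.kron(e, eye2) / (4 * d**2)        # same spectrum as (I.2)
    omega1 = (np.kron(e, e) + np.kron(e_perp, e_perp) / d_perp) / (4 * d**2)
    # lambda * p_k * I/d^2 >= Omega^(k)  iff  lambda * p_k >= d^2 * max eig of Omega^(k)
    c0 = d**2 * np.linalg.eigvalsh(omega0).max()
    c1 = d**2 * np.linalg.eigvalsh(omega1).max()
    # With p0 + p1 = 1, max(c0/p0, c1/p1) is minimized when c0/p0 = c1/p1,
    # which gives lambda_min = c0 + c1.
    return c0, c1, c0 + c1

c0, c1, lam_min = controlization_bound(3)
print(c0, c1, lam_min)
```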