Experimental test of an entropic measurement uncertainty relation for arbitrary qubit observables

Bülent Demirel; Stephan Sponar; Alastair A Abbott; Cyril Branciard; Yuji Hasegawa

doi:10.1088/1367-2630/aafeeb

1. Introduction

The uncertainty principle was one of the first quantum phenomena discovered without any classical analog. In 1927 Heisenberg presented his γ-ray microscope Gedankenexperiment [1] demonstrating that the position and momentum of an electron cannot be determined simultaneously with arbitrary precision. The famous uncertainty relation ${\rm{\Delta }}(Q){\rm{\Delta }}(P)\geqslant \tfrac{{\hslash }}{2}$ for position Q and momentum P [2], however, quantifies the accuracy with which a state can be prepared with respect to the observables of interest, rather than the ability to jointly measure them. For several decades, research on the uncertainty principle focused on such so-called preparation uncertainty relations.

The advent of information theory provided novel approaches to quantifying uncertainty, such as the Shannon entropy [3], with wide-ranging applications [4]; consequently, entropic uncertainty relations were formulated soon thereafter [5–7]. For finite dimensional systems, novel entropic relations such as Deutsch's [8] and Maassen and Uffink's inequalities [9] presented advantages, such as state-independence, over Robertson's relation ${\rm{\Delta }}(A){\rm{\Delta }}(B)\geqslant \left|\tfrac{1}{2i}\langle \psi | [A,B]| \psi \rangle \right|$ , for arbitrary observables A and B and any state $| \psi \rangle$ [10]. Entropic uncertainty relations have subsequently proven useful in quantum cryptography [11, 12], entanglement witnessing [13], complementarity [14] and other topics in quantum information theory [15], where entropy is a natural quantity of interest.

In recent years measurement uncertainty relations, in the spirit of Heisenberg's original proposal, have received renewed attention. Such uncertainty relations can be subdivided into two classes: noise-disturbance relations, which quantify the idea that the more accurately a measurement determines the value of an observable, the more it disturbs the state of the measured system; and noise–noise relations, which quantify the tradeoff between how accurately a measurement can jointly determine the values of two non-commuting observables. New measures and relations for noise and disturbance have been proposed [16, 17], refined [18, 19], and subjected to experimental tests [20–28]. Initially, proposed ways to quantify noise and disturbance focused on distance measures between target observables and measurements [16] or the associated probability distributions [29]. More recently, interest has grown in information-theoretic measures, introduced first by Buscemi et al [30], but also in several subsequent alternative approaches [31–34]. A major challenge in the study of entropic measurement uncertainty relations is to determine how tight they are. This can be difficult for even the simplest systems, as demonstrated in [35], where an allegedly tight noise-disturbance relation for orthogonal qubit observables was given and tested experimentally. Subsequently, however, a counterexample was found [36], showing that the relation can be violated by non-projective measurements. In this article we focus on related noise–noise uncertainty relations, experimentally testing the noise–noise tradeoff for a range of (not necessarily orthogonal) Pauli observables. By implementing four-outcome general quantum measurements we saturate tight noise–noise relations, thereby improving upon previous experiments with projective measurements [35].

2. Theoretical framework

To formally study measurement uncertainty relations one must define measures for two key properties of a measurement device ${ \mathcal M }$ (which may in general implement an arbitrary quantum measurement with any number of outcomes): how accurately it measures a target observable A (its noise), and how much it disturbs the quantum state during measurement (the disturbance). Here we are interested in noise–noise uncertainty relations and therefore restrict our discussion to the former.

While several definitions of noise have previously been studied theoretically and experimentally, we utilize the information-theoretic approach of [30], formulated as follows. Let $\{| a\rangle \}{}_{a}$ be the d eigenstates of the d-dimensional target observable A and represent ${ \mathcal M }$ as a positive-operator valued measure (POVM) ${ \mathcal M }={\{{M}_{m}\}}_{m}$ [15]. The noise is defined in the following scenario: the eigenstates of A are randomly prepared with probability $p(a)=\tfrac{1}{d}$ before ${ \mathcal M }$ is measured, producing an outcome m with probability $p(m| a)=\mathrm{Tr}({M}_{m}| a\rangle \langle a| )$ . If ${ \mathcal M }$ accurately measures A then the value of m should allow one to determine a; if the measurement is noisy, m yields less information about a. This noise is quantified in terms of the conditional Shannon entropy: denoting the random variables associated with a and m as ${\mathbb{A}}$ and ${\mathbb{M}}$ , respectively, the noise of ${ \mathcal M }$ on A is [30]

$\begin{eqnarray}&&N({ \mathcal M },A)=H({\mathbb{A}}| {\mathbb{M}})=-\displaystyle \sum _{a,m}p(a,m){\mathrm{log}}_{2}p(a| m),\end{eqnarray} \tag{ 1 }$

where $p(a,m)=p(a)p(m| a)$ and $p(a| m)$ can be calculated from Bayes' theorem⁴ .
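As an illustration of equation (1), the following minimal sketch evaluates the conditional entropy from a table of outcome probabilities $p(m| a)$, assuming uniformly prepared eigenstates. The function name `noise` and the nested-list layout are choices made for this example only.

```python
import math

def noise(p_m_given_a):
    """Conditional entropy H(A|M) in bits for uniformly prepared eigenstates.

    p_m_given_a[a][m] is the probability of outcome m when eigenstate a
    is prepared (hypothetical input layout chosen for this sketch).
    """
    d = len(p_m_given_a)
    n_outcomes = len(p_m_given_a[0])
    total = 0.0
    for m in range(n_outcomes):
        # Joint distribution p(a, m) = p(a) p(m|a) with p(a) = 1/d.
        joint = [p_m_given_a[a][m] / d for a in range(d)]
        p_m = sum(joint)
        for p_am in joint:
            if p_am > 0.0:
                # Bayes' theorem: p(a|m) = p(a, m) / p(m).
                total -= p_am * math.log2(p_am / p_m)
    return total

# A noiseless qubit measurement: the outcome identifies the eigenstate.
print(noise([[1.0, 0.0], [0.0, 1.0]]))
# A completely uninformative measurement: one full bit of noise.
print(noise([[0.5, 0.5], [0.5, 0.5]]))
```

Terms with $p(a,m)=0$ are skipped, following the usual convention $0\,{\mathrm{log}}_{2}0=0$.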

If A and B are two non-commuting observables, the noises $N({ \mathcal M },A)$ and $N({ \mathcal M },B)$ (defined similarly) cannot both be zero. Consequently, there is a tradeoff between these quantities, which can be expressed by uncertainty relations, e.g. [9, 30]

$\begin{eqnarray}&&N({ \mathcal M },A)+N({ \mathcal M },B)\geqslant -{\mathrm{log}}_{2}\mathop{\max }\limits_{a,b}| \langle a| b\rangle {| }^{2},\end{eqnarray} \tag{ 2 }$

but such relations are often far from tight. More comprehensively, one may look to completely characterize the set

$\begin{eqnarray}&&R(A,B)=\{(N({ \mathcal M },A),N({ \mathcal M },B)):{ \mathcal M }\ \text{is a POVM}\}\end{eqnarray} \tag{ 3 }$

of obtainable noise values.

Recently, it has been shown [36] that for qubit measurements one has R(A, B) = conv E(A, B), where conv denotes the convex hull and E(A, B) is the set of obtainable entropic preparation uncertainty values for A and B (see appendix A). Since E(A, B) is well understood for Pauli observables [37], this relation allows one to characterize and experimentally probe R(A, B). For projective qubit measurements, it turns out that one can obtain precisely the noise values in E(A, B), but (if A and B are such that E(A, B) is non-convex) the noise values in R(A, B)\E(A, B) can only be obtained by non-projective measurements [36].

Focusing on Pauli observables $A=\vec{a}\cdot \vec{\sigma }$ and $B=\vec{b}\cdot \vec{\sigma }$ [with $\vec{a},\vec{b}$ two unit vectors on the Bloch sphere and $\vec{\sigma }$ = (σ_x, σ_y, σ_z)], one has

$\begin{eqnarray}&&R(A,B)=\mathrm{conv}\left\{(s,t):g{\left(s\right)}^{2}+g{\left(t\right)}^{2}-2| \vec{a}\cdot \vec{b}| \,g(s)g(t)\leqslant 1-{\left(\vec{a}\cdot \vec{b}\right)}^{2}\right\},\end{eqnarray} \tag{ 4 }$

where g is the inverse of the binary entropy function h(x) defined for x ∈ [0, 1] as

$\begin{eqnarray}&&h(x)=-\displaystyle \frac{1+x}{2}{\mathrm{log}}_{2}\left(\displaystyle \frac{1+x}{2}\right)-\displaystyle \frac{1-x}{2}{\mathrm{log}}_{2}\left(\displaystyle \frac{1-x}{2}\right).\end{eqnarray} \tag{ 5 }$

When $| \vec{a}\cdot \vec{b}| \gtrsim 0.391$, $E(A,B)$ is convex and the entire region R(A, B) can be obtained by projective measurements; for $| \vec{a}\cdot \vec{b}| \lesssim 0.391$ it is non-convex [37–39] and saturating the noise–noise tradeoff requires four-outcome POVMs [36] (see appendix A for further theoretical details).
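The set appearing inside the convex hull in equation (4) can be tested numerically. The sketch below implements h from equation (5), inverts it by bisection to obtain g (bisection is simply a convenient choice here, not a method stated in the text), and checks the inequality for a candidate pair (s, t). The names `h`, `g` and `in_E`, and the small numerical tolerance, are assumptions of this example.

```python
import math

def h(x):
    """Equation (5): entropy in bits of the distribution ((1+x)/2, (1-x)/2)."""
    total = 0.0
    for p in ((1 + x) / 2, (1 - x) / 2):
        if p > 0.0:
            total -= p * math.log2(p)
    return total

def g(s, tol=1e-12):
    """Inverse of h on [0, 1], by bisection (h decreases from 1 to 0)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) > s:
            lo = mid  # entropy still too large, so x must be larger
        else:
            hi = mid
    return (lo + hi) / 2

def in_E(s, t, ab):
    """True if (s, t) satisfies the inequality inside conv{...} of equation (4)."""
    gs, gt = g(s), g(t)
    # Small slack so that boundary points survive floating-point rounding.
    return gs**2 + gt**2 - 2 * abs(ab) * gs * gt <= 1 - ab**2 + 1e-9

print(in_E(1.0, 1.0, 0.0))  # maximal noise on both observables
print(in_E(0.5, 0.5, 0.0))  # a point that needs the convex hull when a.b = 0
```

For orthogonal observables the point (0.5, 0.5) fails this test, yet it lies on the segment joining (0, 1) and (1, 0), which both pass; it therefore belongs to the convex hull R(A, B) but not to the set inside it.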

3. Experimental procedure

In this work, we describe an experiment probing the noise–noise tradeoff between Pauli observables A and B using neutron spin qubits. Neutrons are ideal test objects for foundational experiments, since they are described by matter waves whose polarization and trajectories can be accurately manipulated and efficiently detected.

The neutrons in our experiment are produced at the TRIGA Mark-II reactor of the Vienna University of Technology, where they are monochromatized to an average wavelength of λ = 2.02 Å and polarized by reflection on a Co–Ti supermirror. The particles entering the beam line are guided by a vertical magnetic field determining the quantization axis and specifying the incident spin as $| +z\rangle$ , the eigenvector of σ_z. The noise–noise tradeoff is then probed by implementing POVMs of the form

$\begin{eqnarray}&&{ \mathcal M }=\left\{q{P}_{+}({\vec{r}}_{1}),q{P}_{-}({\vec{r}}_{1}),(1-q){P}_{+}({\vec{r}}_{2}),(1-q){P}_{-}({\vec{r}}_{2})\right\},\end{eqnarray} \tag{ 6 }$

where ${P}_{\pm }({\vec{r}}_{i})=\tfrac{1}{2}({\mathbb{1}}\pm {\vec{r}}_{i}\cdot \vec{\sigma })$ , and the ${\vec{r}}_{i}:= {\vec{r}}_{i}({\vartheta }_{i},{\varphi }_{i})$ are unit vectors on the Bloch sphere parametrized by the spherical coordinates ${\vartheta }_{i},{\varphi }_{i}$ . For q = 1, equation (6) reduces to a single projective spin measurement along the direction ${\vec{r}}_{1}$ , see figure 1(a), while for a value of q between 0 and 1 it corresponds to a mixture of projective measurements along the directions ${\vec{r}}_{1}$ and ${\vec{r}}_{2}$ with probabilities q and $1-q$ , see figure 1(b).
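To make equation (6) concrete, the following sketch builds the four POVM elements as 2×2 complex matrices and checks that they sum to the identity. The helper names (`bloch`, `projector`, `povm`, `sums_to_identity`) and the example angles are illustrative assumptions, not taken from the experiment.

```python
import math

def bloch(theta, phi):
    """Unit vector from spherical coordinates (polar theta, azimuthal phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def projector(r, sign):
    """P_sign(r) = (1 + sign * r.sigma) / 2 as a nested-list 2x2 matrix."""
    x, y, z = (sign * c for c in r)
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]

def povm(q, r1, r2):
    """The four elements of equation (6), in the order listed there."""
    elements = []
    for weight, r in ((q, r1), (1 - q, r2)):
        for sign in (+1, -1):
            P = projector(r, sign)
            elements.append([[weight * P[i][j] for j in range(2)]
                             for i in range(2)])
    return elements

def sums_to_identity(elements, tol=1e-12):
    """Completeness check: the POVM elements must add up to the identity."""
    for i in range(2):
        for j in range(2):
            total = sum(M[i][j] for M in elements)
            if abs(total - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

# Example directions in the yz-plane (phi = pi/2); the angles are arbitrary.
M = povm(0.3, bloch(0.2, math.pi / 2), bloch(1.1, math.pi / 2))
print(sums_to_identity(M))
```

Each pair $q{P}_{\pm }({\vec{r}}_{1})$ sums to $q{\mathbb{1}}$ and each pair $(1-q){P}_{\pm }({\vec{r}}_{2})$ to $(1-q){\mathbb{1}}$, so completeness holds for any q between 0 and 1.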

**Figure 1.** (a), (b) *Measurement strategies*. The vector ${\vec{r}}_{1}({\vartheta }_{1},{\varphi }_{1})$ in the Bloch sphere (a) represents a projective measurement of the observable ${\vec{r}}_{1}\cdot \vec{\sigma }$ . The vectors ${\vec{r}}_{1}$ and ${\vec{r}}_{2}$ in (b) represent the projective measurements that are mixed with probabilities q and $1-q$ to realize the POVM ${ \mathcal M }$ of equation (6). (c) *Neutron polarimeter setup of the measurement*. The red arrow indicating the state $| +z\rangle$ is rotated in DC-Coil 1 by applying a magnetic field ${B}_{x}^{\mathrm{trans}}(\alpha )$ , before passing Analyzer 1, which projects the state onto $| +z\rangle$ , with probability q or $1-q$ (depending on α). A random number generator then selects a magnetic field ${B}_{x}^{\mathrm{RG}}$ to apply in DC-Coil 2, which prepares one of the eigenstates $| a\rangle ,| b\rangle$ of A and B. Finally, the third magnetic field ${B}_{x}^{\mathrm{PM}}(\vartheta )$ in DC-Coil 3 and Analyzer 2 realize a projective measurement in the direction ${\vec{r}}_{1}$ on the neutrons passing Analyzer 1 with probability q, or in the direction ${\vec{r}}_{2}$ on the ensemble transmitted with probability $1-q$ . The measurement direction ${\vec{r}}_{1}$ can be brought out of the $\vec{a}\vec{b}$ -plane by displacing DC-Coil 3 by ${y}_{0}(\varphi )$ . For further details, see appendix B.

To perform the required measurements, an experimental setup (see figure 1(c)) similar to that in [22] is employed. The individual elements of the POVM are measured successively, allowing the statistics for the whole POVM to be reconstructed. To this end, the initial spins are first rotated by DC-Coil 1 before being transmitted through Analyzer 1 with probability $q$ or $1-q$, depending on the orientation of the incident spin. After Analyzer 1, one of the observables' eigenstates $| a\rangle$ or $| b\rangle$ (with eigenvalues a, b = ±1) is generated uniformly at random by inducing an appropriately chosen rotation at DC-Coil 2. DC-Coil 3 is set so that the incoming neutrons pass Analyzer 2 with probabilities $q\mathrm{Tr}[{P}_{\pm }({\vec{r}}_{1})| a\rangle \langle a| ]$ or $(1-q)\mathrm{Tr}[{P}_{\pm }({\vec{r}}_{2})| a\rangle \langle a| ]$ , and likewise for the $| b\rangle$ eigenstates. At the end of the beam line a boron trifluoride detector registers all incoming neutrons so that, given these settings, one of $q{P}_{\pm }({\vec{r}}_{1})$ or $(1-q){P}_{\pm }({\vec{r}}_{2})$ is measured; each detection for one of these four measurement operators corresponds to a different outcome m. We thereby record the counts $I_{a,m}$ of measuring outcome m when $| a\rangle$ is prepared, and similarly for $I_{b,m}$, and estimate the probabilities $p(a,m)={I}_{a,m}/{\sum }_{a,m}{I}_{a,m}$ and $p(b,m)={I}_{b,m}/{\sum }_{b,m}{I}_{b,m}$ (under a standard fair-sampling assumption), permitting the noises $N({ \mathcal M },A)$ and $N({ \mathcal M },B)$ to be calculated from equation (1).
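The estimation step just described can be sketched as follows: the counts are normalized to the joint distribution p(a, m) and inserted into equation (1). The function name `noise_from_counts` and the count values are invented for illustration and are not data from the experiment.

```python
import math

def noise_from_counts(counts):
    """Noise in bits from counts[a][m], the number of detections of
    outcome m when eigenstate a was prepared."""
    total_counts = sum(sum(row) for row in counts)
    n_outcomes = len(counts[0])
    noise = 0.0
    for m in range(n_outcomes):
        column = [row[m] for row in counts]
        column_total = sum(column)
        for c in column:
            if c > 0:
                # p(a, m) = c / total_counts and p(a|m) = c / column_total.
                noise -= (c / total_counts) * math.log2(c / column_total)
    return noise

# Hypothetical counts for the two eigenstates of A and four outcomes m.
example_counts = [[480, 20, 260, 240],
                  [20, 480, 240, 260]]
print(noise_from_counts(example_counts))
```

Because only ratios of counts enter, the overall detection efficiency drops out, which is where the fair-sampling assumption is used.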

3.1. Results

To obtain all the relevant results the following procedure is applied. We take $\vec{a}={\vec{e}}_{z}$ (the unit vector in the z direction) and choose $\vec{b}$ in the yz-plane, thus determining the value $| \vec{a}\cdot \vec{b}|$ characterizing the noise–noise tradeoff. We initially take q = 1 so that the projectors ${P}_{\pm }({\vec{r}}_{1})$ are measured on the entire neutron ensemble. The vector ${\vec{r}}_{1}=\cos ({\vartheta }_{1}){\vec{e}}_{z}+\sin ({\vartheta }_{1}){\vec{e}}_{y}$ is rotated in the interval ${\vartheta }_{1}\in [0^\circ ,180^\circ ]$ with increments of Δϑ₁ ≃ 10° (see figure 1(a)). The variation of the polar angle changes the probabilities of passing Analyzers 1 and 2 and reaching the detector, and thus $p(a| m)$ and $p(b| m)$ . When ϑ₁ = 0 the projectors are ${P}_{\pm }({\vec{r}}_{1})={P}_{\pm }(\vec{a})$ and the probability $p(a| m)$ is maximally peaked, while $p(b| m)$ is evenly distributed. For ${\vec{r}}_{1}=\vec{b}$ the situation is exactly reversed. When ${\vec{r}}_{1}$ is in between $\vec{a}$ and $\vec{b}$ , these measurements attain the lower-left boundary of E(A, B) and are optimal amongst projective measurements. The upper-right boundaries of E(A, B) can, for completeness, be obtained by rotating ${\vec{r}}_{1}$ out of the plane spanned by $\vec{a}$ and $\vec{b}$ by an azimuthal angle φ₁ (varied experimentally by displacing DC-Coil 3, see figure 1(c)), increasing the noise with respect to both A and B (figures 2 and 3).

**Figure 2.** Plots of the noise–noise regions R(A, B) for $A=\vec{a}\cdot \vec{\sigma },$ $B=\vec{b}\cdot \vec{\sigma }$ with (a) $\vec{a}\cdot \vec{b}\simeq 0$ , (b) $\vec{a}\cdot \vec{b}\simeq 0.07$ and (c) $\vec{a}\cdot \vec{b}\simeq 0.19$ . The shaded noise–noise regions are separated here into two areas. The purple area shows the region E(A, B) reachable by projective measurements; the blue data points are measured in the $\vec{a}\vec{b}$ -plane ( ${\varphi }_{1}=\tfrac{\pi }{2}$ ) starting from ϑ₁ = 0 ( ${\vec{r}}_{1}=\vec{a}$ ) and increasing in steps of Δϑ₁ = 10°. When $\vec{a}\cdot \vec{b}=0$ the noise is symmetric around ϑ₁ = π/2; otherwise, the closed blue curve is obtained. The purple points are obtained by taking q = 1 and rotating ${\vec{r}}_{1}=\vec{a}$ (top boundary) and ${\vec{r}}_{1}=\vec{b}$ (right boundary) out of the $\vec{a}\vec{b}$ -plane by increasing the azimuthal angle φ₁. The orange area corresponds to R(A, B)\E(A, B), and the noise–noise values inside it can only be reached by POVMs; the points on its lower-left linear boundary can be obtained by a four-outcome POVM realized as a mixture of two projective measurements along some fixed directions ${\vec{r}}_{1}$ and ${\vec{r}}_{2}$ , with varying values of q, see equation (6) and figure 1(b). Outside of these regions, the values of $(N({ \mathcal M },A),N({ \mathcal M },B))$ in the hatched areas are forbidden and cannot be reached by any quantum measurement. Error bars correspond to one standard deviation arising from the Poissonian statistics of the neutron count rate.

**Figure 3.** Plots of the noise–noise regions $R(A,B)$ with (a) $\vec{a}\cdot \vec{b}\simeq 0.35$ , and (b) $\vec{a}\cdot \vec{b}\simeq 0.5$ . These cases are on either side of the value $\vec{a}\cdot \vec{b}\simeq 0.391$ at which E(A, B) becomes convex. In (a), due to the size of the error bars (as specified in figure 2), improvements beyond projective measurements to obtain noise values on the orange dashed line are experimentally no longer possible, while in (b) the optimal theoretical tradeoff is already attained with projective measurements.

For $| \vec{a}\cdot \vec{b}| \gtrsim 0.391$ projective measurements are optimal and this approach saturates the noise–noise tradeoff. This is no longer true for $| \vec{a}\cdot \vec{b}| \lesssim 0.391$ , where the noise may be decreased further by non-projective measurements. To saturate the tradeoff and attain the lower-left boundary of $R(A,B)$ the mixing parameter q is varied to implement the full four-outcome POVM ${ \mathcal M }$ . This is done by mixing the statistics obtained by the projectors ${P}_{\pm }({\vec{r}}_{1})$ and ${P}_{\pm }({\vec{r}}_{2})$ , where the polar angles ϑ₁ and ϑ₂, associated with ${\vec{r}}_{1},{\vec{r}}_{2}$ , are determined by the projective measurements ${{ \mathcal M }}_{i}$ minimizing $N({{ \mathcal M }}_{i},A)+N({{ \mathcal M }}_{i},B)$ . With these angles fixed (giving vectors ${\vec{r}}_{1}$ and ${\vec{r}}_{2}$ between $\vec{a}$ and $\vec{b}$ , symmetric about their angle bisector; see figure 1(b)), a range of POVMs is implemented by varying q in steps of Δq ≃ 0.1, i.e. by changing the ratio of neutrons transmitted through Analyzer 1. These measurements attain the boundary of the orange noise–noise region R(A, B) in figure 2.
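The effect of varying q can be sketched numerically. The example relies on two facts that follow from equations (1) and (6): a projective measurement along $\vec{r}$ has noise $h(| \vec{a}\cdot \vec{r}| )$ on $A=\vec{a}\cdot \vec{\sigma }$, and, because the outcome records which projective measurement was performed, the noise of the mixture is the q-weighted average of the two projective noises. The function names and the choice of orthogonal observables are assumptions of this illustration.

```python
import math

def h(x):
    """Equation (5): entropy in bits of the distribution ((1+x)/2, (1-x)/2)."""
    total = 0.0
    for p in ((1 + x) / 2, (1 - x) / 2):
        if p > 0.0:
            total -= p * math.log2(p)
    return total

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def mixture_noise(q, target, r1, r2):
    """Noise on the Pauli observable along `target` for the POVM of
    equation (6): weight q on direction r1, weight 1 - q on direction r2."""
    return q * h(abs(dot(target, r1))) + (1 - q) * h(abs(dot(target, r2)))

# Orthogonal observables, mixing measurements along a and along b.
a = (0.0, 0.0, 1.0)
b = (0.0, 1.0, 0.0)
points = []
for k in range(11):
    q = k / 10  # steps of 0.1, mirroring the step size quoted in the text
    points.append((q, mixture_noise(q, a, a, b), mixture_noise(q, b, a, b)))

for q, n_a, n_b in points:
    print(f"q={q:.1f}  N(A)={n_a:.3f}  N(B)={n_b:.3f}  sum={n_a + n_b:.3f}")
```

In this orthogonal case every point satisfies N(A) + N(B) = 1, so the sweep traces out the straight lower-left boundary of R(A, B).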

The measurement results for three different vectors $\vec{b}$ (for which E(A, B) is non-convex) are given in figure 2, with (a) $\vec{a}\cdot \vec{b}\simeq 0$ , (b) $\vec{a}\cdot \vec{b}\simeq 0.07$ and (c) $\vec{a}\cdot \vec{b}\simeq 0.19$ . The noise–noise region R(A, B) is broken into two subregions: the purple region E(A, B) of values attainable with projective measurements, and the orange region R(A, B)\E(A, B). The closed blue curve shows the values attainable by projective measurements in the $\vec{a}\vec{b}$ -plane, while the dashed orange line shows the optimal values attainable with POVMs. In figure 2(a), $\vec{a}$ is perpendicular to $\vec{b}$ and projective measurements ${ \mathcal M }$ in the $\vec{a}\vec{b}$ -plane give noise values that lie on the lower-left boundary (blue curve) of E(A, B). To saturate the noise–noise tradeoff, we mix projective measurements in directions ${\vec{r}}_{1}=\vec{a}$ and ${\vec{r}}_{2}=\vec{b}$ . The resulting points are color coded from red (q = 0) to orange (q = 1). We see that POVMs give a considerable improvement on the uncertainty relation over projective measurements, to which previous experiments had been restricted [35]. For instance, for the data point corresponding to q ≃ 0.494 in figure 2(a), we obtain noise values of $(N({ \mathcal M },A),N({ \mathcal M },B))=(0.511\pm 0.012,0.529\pm 0.021)$ . This violates the relation $g{\left(N({ \mathcal M },A)\right)}^{2}+g{\left(N({ \mathcal M },B)\right)}^{2}\leqslant 1$ satisfied by all projective measurements (corresponding to the lower boundary of the purple region E(A, B), see appendix A), by more than 6 standard deviations. When the eigenstates of B approach those of A, as is the case in figure 2(b), the lower boundary of R(A, B) (orange) and, more noticeably, the purple region E(A, B) start shifting downwards. This becomes more apparent in figure 2(c), where the optimal choice of ${\vec{r}}_{1}$ and ${\vec{r}}_{2}$ to mix is obtained for the projective measurements giving $(N({ \mathcal M },A),N({ \mathcal M },B))\simeq (0.02,0.95)$ and $(N({ \mathcal M },A),N({ \mathcal M },B))\simeq (0.95,0.02)$ , respectively, corresponding to ϑ₁ ≃ 5° and ϑ₂ ≃ 74° ( $\arccos (\vec{a}\cdot \vec{b})\simeq 79^\circ$ ). By realizing the POVM accordingly for a range of q values we again succeed in saturating the noise–noise tradeoff. (See appendix C for further quantitative details of the measurements.)
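The violation quoted for the q ≃ 0.494 data point can be checked with a short calculation. The sketch below evaluates the left-hand side of the projective-measurement relation for the reported noise values, and estimates a significance by first-order error propagation, assuming independent errors on the two noise values. That propagation scheme is an assumption of this example and is not necessarily the analysis used by the authors.

```python
import math

def h(x):
    """Equation (5): entropy in bits of the distribution ((1+x)/2, (1-x)/2)."""
    total = 0.0
    for p in ((1 + x) / 2, (1 - x) / 2):
        if p > 0.0:
            total -= p * math.log2(p)
    return total

def g(s, tol=1e-13):
    """Inverse of h on [0, 1], by bisection (h decreases from 1 to 0)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) > s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def dg(s, eps=1e-6):
    """Central-difference estimate of the derivative of g."""
    return (g(s + eps) - g(s - eps)) / (2 * eps)

# Noise values and one-sigma errors quoted in the text for q ~ 0.494.
n_a, sigma_a = 0.511, 0.012
n_b, sigma_b = 0.529, 0.021

lhs = g(n_a) ** 2 + g(n_b) ** 2
# First-order propagation with independent errors (assumption of this sketch).
sigma_lhs = math.hypot(2 * g(n_a) * dg(n_a) * sigma_a,
                       2 * g(n_b) * dg(n_b) * sigma_b)
n_sigma = (lhs - 1.0) / sigma_lhs
print(f"lhs = {lhs:.3f}, sigma = {sigma_lhs:.3f}, violation = {n_sigma:.1f} sigma")
```

Under these assumptions the left-hand side comes out near 1.17, between 6 and 7 standard deviations above the projective bound of 1, consistent with the significance stated above.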

In figure 3 we present two cases with inner products (a) $\vec{a}\cdot \vec{b}\simeq 0.35$ and (b) $\vec{a}\cdot \vec{b}\simeq 0.5$ , on either side of the critical value of $| \vec{a}\cdot \vec{b}| \simeq 0.391$ at which the region E(A, B) becomes convex. In figure 3(a) the dashed orange line from $(N({ \mathcal M },A),N({ \mathcal M },B))\simeq (0.17,0.70)$ to (0.70, 0.17) implies that projective measurements are theoretically incapable of saturating the noise–noise tradeoff, but improvements by POVMs are no longer resolvable in our experiment. In figure 3(b) the region E(A, B) is already convex, so that R(A, B) = E(A, B) can be fully attained with projective measurements; hence improvements by general POVMs are no longer possible.

4. Discussion

A measurement device cannot jointly measure two non-commuting observables with arbitrary precision, and thus there is a tradeoff between the accuracy with which they can both be measured, captured by noise–noise uncertainty relations. Using a definition of noise that quantifies how well a measurement device can distinguish eigenstates of non-commuting observables [30], we experimentally tested tight entropic noise–noise uncertainty relations for qubits [36] for various pairs of Pauli spin observables. For closely aligned observables, we saw that the uncertainty relation could be saturated with simple projective measurements. However, we verified experimentally that this is not generally the case and that four-outcome POVMs yield better measurement results that saturate the uncertainty relation when projective measurements cannot. Advantages of POVMs over projective measurements have also been reported for other features of measurements [40], and it would be interesting to clarify this connection in future work. Our study, which focused on noise–noise relations, paves the way for further experiments testing entropic noise-disturbance relations [30]. For such relations on qubit systems general quantum measurements again offer advantages over projective measurements, but their experimental realization necessitates the implementation of non-trivial post-measurement transformations on the measured states [36].

Acknowledgments

BD, SS and YH acknowledge support by the Austrian Science Fund (FWF) Projects No. P30677-N20 and No. P27666-N20. AA and CB acknowledge financial support from the Retour Post-Doctorants program (ANR-13-PDOC-0026) of the French National Research Agency.

Appendix A. Theoretical framework

Let us first give a more detailed presentation of the framework in which the information-theoretic noise is defined, as well as of the form of the regions R(A, B) and E(A, B). Further details can be found in [30, 36].

A.1. Operational scenario defining information-theoretic noise

The most general model of a quantum measurement is that of a quantum instrument, which completely describes both the statistics of a measurement and the transformation it induces on the measured system. To define the noise, however, only the statistics (and not the transformation) are of interest to us, and these can be described by a POVM. Recall that a POVM ${ \mathcal M }$ is a collection ${\{{M}_{m}\}}_{m}$ of Hermitian positive semidefinite operators M_m satisfying ${\sum }_{m}{M}_{m}={\mathbb{1}}$ , where ${\mathbb{1}}$ is the identity operator. For a given quantum state ρ, the probability of obtaining outcome m is $\mathrm{Tr}[{M}_{m}\rho ]$ .

The information-theoretic definition of noise $N({ \mathcal M },A)$ is best understood in the operational framework described in the main text, and illustrated in figure A1 below. For simplicity, let A be a d-dimensional non-degenerate observable with eigenstates $\{| a\rangle \}{}_{a}$ . The operational scenario can be seen as an experiment in which the eigenstates $| a\rangle$ are prepared uniformly at random, i.e. with probability $p(a)=\tfrac{1}{d}$ , before being measured by ${ \mathcal M }$ . The result of the measurement is the outcome m with probability $p(m| a)=\mathrm{Tr}[{M}_{m}| a\rangle \langle a| ]$; note that a non-projective measurement may have more than d outcomes. One thus has the joint distribution $p(a,m)=p(a)p(m| a)=\tfrac{1}{d}\mathrm{Tr}[{M}_{m}| a\rangle \langle a| ]$ specifying the probability of preparing $| a\rangle$ and obtaining outcome m. It will be convenient to denote the random variables corresponding to a and m by ${\mathbb{A}}$ and ${\mathbb{M}}$ , respectively, where we use the double-struck letters to differentiate the classical random variable ${\mathbb{A}}$ from the quantum observable A.

**Figure A1.** A schematic of the operational scenario defining the noise $N({ \mathcal M },A)$ of a measurement ${ \mathcal M }$ with respect to the target observable A [36]. The eigenstates $| a\rangle$ of A are prepared uniformly at random before being measured by ${ \mathcal M }$ , which produces an outcome m.

Given a particular measurement outcome m one may ask what state $| a\rangle$ was prepared. If the measurement is noiseless, one should be able to determine this with certainty; conversely, the uncertainty in which eigenstate of A was prepared, given m, is used to quantify the noise of ${ \mathcal M }$ with respect to A. More precisely, this is quantified via the conditional Shannon entropy as [36]

$\begin{eqnarray}&&N({ \mathcal M },A)=H({\mathbb{A}}| {\mathbb{M}})=-\displaystyle \sum _{a,m}p(a,m){\mathrm{log}}_{2}p(a| m),\end{eqnarray} \tag{ A.1 }$

where

$\begin{eqnarray}&&p(a| m)=\displaystyle \frac{p(a,m)}{{\displaystyle \sum }_{a}p(a,m)}=\mathrm{Tr}\left[| a\rangle \langle a| \displaystyle \frac{{M}_{m}}{\mathrm{Tr}[{M}_{m}]}\right].\end{eqnarray} \tag{ A.2 }$

A large conditional Shannon entropy $H({\mathbb{A}}| {\mathbb{M}})$ means there is a lot of uncertainty in the value of a (i.e. the eigenstate prepared) given an observation m, so this definition indeed quantifies the intuitive notion of noise discussed above.

The noise $N({ \mathcal M },A)$ can thus be easily measured by preparing randomly the eigenstates $| a\rangle$ of A before measuring them and estimating the joint distribution p(a, m) from the observed incident counts. To probe the noise–noise tradeoff, $N({ \mathcal M },A)$ and $N({ \mathcal M },B)$ must both be calculated, which requires performing two such experiments (preparing randomly the states $| b\rangle$ in the second). In practice, both experiments can be performed simultaneously by preparing the eigenstates $\{| a\rangle ,| b\rangle \}{}_{a,b}$ at random with probability $\tfrac{1}{2d}$ and separating out the statistics p(a, m) and p(b, m); this is precisely what we do in the experiment described in the main text.

A.2. $R(A,B)$ and its connection with entropic preparation uncertainty

The set $R(A,B)=\left\{\left(N({ \mathcal M },A),N({ \mathcal M },B)\right):{ \mathcal M }\ \text{is a valid POVM}\right\}$ of obtainable noise values completely characterizes the noise–noise tradeoff relation. Characterizing R(A, B) is, in general, difficult due to the nonlinearity of equation (A.1) and the need to consider the noise obtainable by arbitrary POVMs (which themselves are not easily characterized beyond the simplest systems) [30, 36]. To do so for qubit measurements, we exploit a relation to entropic preparation uncertainty relations. Let $H(A| \rho )$ be the measurement entropy of A for a state ρ, defined as

$\begin{eqnarray}&&H(A| \rho )=-\displaystyle \sum _{a}\mathrm{Tr}\left[| a\rangle \langle a| \rho \right]{\mathrm{log}}_{2}\mathrm{Tr}\left[| a\rangle \langle a| \rho \right].\end{eqnarray} \tag{ A.3 }$

The entropic preparation region

$\begin{eqnarray}&&E(A,B)=\left\{\left(H(A| \rho ),H(B| \rho )\right):\rho \ \text{is a density matrix}\right\}\end{eqnarray} \tag{ A.4 }$

characterizes how well-defined the values of A and B can be for any quantum state ρ, and is a key object in the study of entropic preparation uncertainty relations [37].

In [36] it was shown that $R(A,B)\subseteq \mathrm{conv}\ E(A,B)$ , where $\mathrm{conv}$ denotes the convex hull. Moreover, the authors showed that one has equality for qubit systems, and for such systems E(A, B) is well-understood. Indeed, it has been shown that for Pauli observables $A=\vec{a}\cdot \vec{\sigma }$ and $B=\vec{b}\cdot \vec{\sigma }$ [37]

$\begin{eqnarray}&&E(A,B)=\left\{(s,t):g{\left(s\right)}^{2}+g{\left(t\right)}^{2}-2| \vec{a}\cdot \vec{b}| \,g(s)g(t)\leqslant 1-{\left(\vec{a}\cdot \vec{b}\right)}^{2}\right\}\end{eqnarray} \tag{ A.5 }$

from which one obtains equation (4) of the main text (after which g is defined). By exploiting the fact that E(A, B) is convex when $| \vec{a}\cdot \vec{b}| \gtrsim 0.391$ , for which one thus has R(A, B) = E(A, B), one can, for such observables, write the explicit tight noise–noise uncertainty relation

$\begin{eqnarray}&&g{\left(N({ \mathcal M },A)\right)}^{2}+g{\left(N({ \mathcal M },B)\right)}^{2}-2| \vec{a}\cdot \vec{b}| \,g\left(N({ \mathcal M },A)\right)g\left(N({ \mathcal M },B)\right)\leqslant 1-{\left(\vec{a}\cdot \vec{b}\right)}^{2}.\quad \end{eqnarray} \tag{ A.6 }$

When $| \vec{a}\cdot \vec{b}| \lesssim 0.391$ it is not possible to have an explicit inequality in this way. Nonetheless, the boundary of R(A, B) can be found and expressed in a piecewise form. Indeed, note that only the 'lower boundary' of E(A, B) (i.e. the points (s,t) on the boundary of E(A, B) for which there are no points (u,v) in E(A, B) with both u < s and v < t) is non-convex for $| \vec{a}\cdot \vec{b}| \lesssim 0.391$ (see figure 2 of the main text). The convex hull of E(A, B) can be readily computed numerically; its lower boundary consists of a linear segment between two points $\left({N}_{1}^{* }({ \mathcal M },A),{N}_{1}^{* }({ \mathcal M },B)\right)$ and $\left({N}_{2}^{* }({ \mathcal M },A),{N}_{2}^{* }({ \mathcal M },B)\right)$ (corresponding to the two points obtained by the projective measurements that must be mixed to saturate the lower boundary of R(A, B)) and of the curve given by the points saturating equation (A.6) elsewhere.
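A simple way to locate the two endpoints numerically is sketched below. It uses the criterion stated in the main text (the projective measurements to be mixed are those minimizing $N({ \mathcal M },A)+N({ \mathcal M },B)$), together with the fact that a projective measurement along $\vec{r}$ has noise $h(| \vec{a}\cdot \vec{r}| )$ on $A=\vec{a}\cdot \vec{\sigma }$. The grid search, the function name `optimal_angles` and the restriction to $0\leqslant \vec{a}\cdot \vec{b}\lt 1$ are assumptions of this sketch.

```python
import math

def h(x):
    """Equation (5): entropy in bits of the distribution ((1+x)/2, (1-x)/2)."""
    total = 0.0
    for p in ((1 + x) / 2, (1 - x) / 2):
        if p > 0.0:
            total -= p * math.log2(p)
    return total

def optimal_angles(ab, steps=20000):
    """Polar angles (in degrees, measured from a) of the two in-plane
    projective measurements minimizing N(A) + N(B), for 0 <= ab < 1."""
    theta_ab = math.acos(ab)  # angle between a and b
    best_theta, best_sum = 0.0, float("inf")
    for k in range(steps + 1):
        # Scan from a up to the bisector; the second direction follows
        # by the symmetry about the bisector.
        theta = 0.5 * theta_ab * k / steps
        total = h(math.cos(theta)) + h(math.cos(theta_ab - theta))
        if total < best_sum:
            best_theta, best_sum = theta, total
    return math.degrees(best_theta), math.degrees(theta_ab - best_theta)

for ab in (0.0, 0.19, 0.5):
    t1, t2 = optimal_angles(ab)
    print(f"a.b = {ab:.2f}: theta_1 = {t1:.1f} deg, theta_2 = {t2:.1f} deg")
```

For $\vec{a}\cdot \vec{b}=0.19$ the search returns angles close to the values ϑ₁ ≃ 5° and ϑ₂ ≃ 74° quoted in the main text. For $\vec{a}\cdot \vec{b}=0.5$ both angles coincide with the bisector, so no mixing is needed, as expected when E(A, B) is convex.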

For orthogonal Pauli measurements $\vec{a}\cdot \vec{b}=0$ , it is worth noting that one has simply the tight inequality

$\begin{eqnarray}&&N({ \mathcal M },A)+N({ \mathcal M },B)\geqslant 1,\end{eqnarray} \tag{ A.7 }$

which takes precisely the same form as the Maassen and Uffink inequality (see equation (2) in the main text). In contrast, projective measurements (which can only give points in E(A, B)) satisfy the relation

$\begin{eqnarray}&&g{\left(N({ \mathcal M },A)\right)}^{2}+g{\left(N({ \mathcal M },B)\right)}^{2}\leqslant 1,\end{eqnarray} \tag{ A.8 }$

which can therefore be violated by POVMs.
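The violation can be made concrete with a small numerical check. The sketch below considers orthogonal observables and an equal mixture of the projective measurements of A and of B, with the choice of measurement recorded. It assumes that the noise is a conditional entropy, so that it is linear in the mixing weight q, and that g is the inverse of the binary entropy function h(x) of ((1 + x)/2, (1 − x)/2) on [0, 1]; both definitions live outside this appendix section and are assumptions here, as are the helper names.

```python
import math

def h(x):
    """Binary entropy (bits) of the distribution ((1 + x)/2, (1 - x)/2)."""
    total = 0.0
    for p in ((1 + x) / 2, (1 - x) / 2):
        if p > 0:
            total -= p * math.log2(p)
    return total

def g(s, tol=1e-12):
    """Assumed form of g: inverse of h on [0, 1], by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) > s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mixed_noise(q, noises_1, noises_2):
    """Noise pair of a recorded mixture, assumed linear in the weight q."""
    return tuple(q * n1 + (1 - q) * n2 for n1, n2 in zip(noises_1, noises_2))

# Orthogonal case: measuring A exactly gives no noise on A and maximal noise on B.
along_a = (0.0, 1.0)
along_b = (1.0, 0.0)

s, t = mixed_noise(0.5, along_a, along_b)
lhs_a8 = g(s) ** 2 + g(t) ** 2   # left-hand side of equation (A.8)

print(s + t)     # saturates equation (A.7)
print(lhs_a8)    # exceeds 1, so equation (A.8) is violated
```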

Experimentally, we are interested in saturating the noise–noise tradeoff relation, achievable by performing measurements for which $\left(N({ \mathcal M },A),N({ \mathcal M },B)\right)$ is on the boundary of R(A, B). Of particular interest is the 'lower boundary' of R(A, B); measurements obtaining points on this are optimal with respect to the noise–noise tradeoff. In order to obtain such points when E(A, B) is non-convex, one must consider non-projective measurements. In particular, to this end we implement four-outcome POVMs which correspond to probabilistic mixtures of projective measurements (together with a record of which projective measurement was performed). Projective measurements suffice to obtain all points in E(A, B) (and thus in R(A, B) when E(A, B) is already convex).

Appendix B. Experimental techniques

The experimental approach we use to probe uncertainty relations with neutron spins is similar to that used in [22], and further details on the general approach can be found therein. The primary additional challenge in the present experiment is to implement the four-outcome POVMs needed to probe the tight entropic uncertainty relation.

The initially polarized neutrons encounter two supermirror analyzers as shown in figure 1(c), which separate the neutrons stochastically according to their up and down spins by reflection on a magnetic multilayer structure, with the transmitted beams continuing to the next stage of the experiment. The polarizer functions as a Stern–Gerlach magnet with a similar working principle. Each supermirror has a probability of transmitting the neutrons depending on the relative angle between the incident spin and the analyzer orientation. The orientations of the analyzers are kept static along the positive z direction (aligned with $| +z\rangle$ ) while the incoming neutron spin state is changed dynamically by the first and third DC-coils.

The three DC-coils in the setup all work identically. Static magnetic fields are generated by direct currents fed into wires arranged as solenoids. One wire spirals helically along the vertical axis and another wire winds around the x-axis perpendicular to the neutron's y-direction of propagation. The large dimensions of the coils guarantee that the magnetic field inside the solenoids can be regarded as homogeneous for the neutrons. The purpose of the vertical magnetic field is to compensate for the exterior guide field and Earth's magnetic field, which are effectively nullified in the solenoids.

The purpose of the lateral coil is to induce a unitary Larmor precession of the initial spin. Classically, the field in the solenoid exerts a torque on the polarization vector, which quantum-mechanically is described by the unitary operator

$\begin{eqnarray}&&U(\alpha )=\exp \left({\rm{i}}\displaystyle \frac{\alpha }{2}{\sigma }_{x}\right)=\exp \left({\rm{i}}\displaystyle \frac{\gamma {B}_{x}t}{2}{\sigma }_{x}\right),\end{eqnarray} \tag{ B.1 }$

where the rotation angle $\alpha =\gamma {B}_{x}t$ is determined by the magnitude ${B}_{x}$ of the magnetic field in the x-direction, the neutron gyromagnetic factor γ = $1.833\times {10}^{8}\,{\rm{rad}}\,{{\rm{s}}}^{-{\rm{1}}}\,{{\rm{T}}}^{-1}$ , and the time of flight through the solenoid $t=\tfrac{l}{{v}_{n}}\cong \tfrac{0.019\,{\rm{m}}}{1958\,{\rm{m}}\,{{\rm{s}}}^{-1}}\,\approx 10\,\mu {\rm{s}}$ . The rotation angle α is controlled by the current that generates the magnetic field.
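To give a feel for the orders of magnitude in equation (B.1), the following sketch evaluates the time of flight, the field required for a given rotation angle, and the probability that the rotated state is still found in $| +z\rangle$. The numerical constants are those quoted above; the function names are hypothetical, and the closed form $\exp ({\rm{i}}\tfrac{\alpha }{2}{\sigma }_{x})=\cos \tfrac{\alpha }{2}\,{\mathbb{1}}+{\rm{i}}\sin \tfrac{\alpha }{2}\,{\sigma }_{x}$ is used in place of a matrix exponential.

```python
import math

GAMMA = 1.833e8     # neutron gyromagnetic factor in rad s^-1 T^-1 (value quoted in the text)
LENGTH = 0.019      # path length through the solenoid in m
VELOCITY = 1958.0   # neutron velocity in m/s

def time_of_flight():
    """Time of flight t = l / v_n through the solenoid, in seconds."""
    return LENGTH / VELOCITY

def field_for_angle(alpha):
    """Field B_x (in tesla) giving a rotation angle alpha = gamma * B_x * t."""
    return alpha / (GAMMA * time_of_flight())

def rotate_plus_z(alpha):
    """Amplitudes of U(alpha)|+z> in the (|+z>, |-z>) basis."""
    return (complex(math.cos(alpha / 2), 0.0), complex(0.0, math.sin(alpha / 2)))

def transmission_probability(alpha):
    """Probability |<+z|U(alpha)|+z>|^2 = cos^2(alpha / 2)."""
    up, _ = rotate_plus_z(alpha)
    return abs(up) ** 2

print(time_of_flight())           # close to 1e-5 s
print(field_for_angle(math.pi))   # field for a pi rotation, of order a few mT
```

With these numbers a spin flip (α = π) requires a field of roughly 1.8 mT, and `transmission_probability(alpha)` reproduces the weight q used for the POVM elements below.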

Let $| \psi \rangle \equiv {U}_{1}(\alpha )| +z\rangle =\exp ({\rm{i}}\tfrac{\alpha }{2}{\sigma }_{x})| +z\rangle$ be the state prior to Analyzer 1 and $| -\psi \rangle \equiv {U}_{1}(\alpha +\pi )| +z\rangle \,=\exp \left({\rm{i}}\tfrac{\alpha +\pi }{2}{\sigma }_{x}\right)| +z\rangle$ be the orthogonal spin state. The ideal approach to implementing the four-outcome POVM in equation (6) would be to perform the projective measurement ${\{{P}_{\pm }({\vec{r}}_{1})\}}_{\pm }$ on the sub-ensemble of neutrons transmitted (with probability q = $| \langle +z| \psi \rangle {| }^{2}={\cos }^{2}\left(\tfrac{\alpha }{2}\right)$ ) by Analyzer 1 and the projective measurement ${\{{P}_{\pm }({\vec{r}}_{2})\}}_{\pm }$ on the reflected sub-ensemble (i.e. with probability $1-q$ = $| \langle +z| -\psi \rangle {| }^{2}={\sin }^{2}\left(\tfrac{\alpha }{2}\right)$ ). In practice, instead of implementing four separate beams and detectors, at each analyzer the reflected parts are discarded and the four operators are measured sequentially by applying the appropriate rotations at DC-Coils 1 and 3. These four configurations, corresponding to the four POVM elements, along with the randomly chosen state to prepare ( $| \pm a\rangle$ or $| \pm b\rangle$ ) are thus cycled in 60 s slots while keeping the total beam intensity de facto constant. The counts ${I}_{a,m}$ and ${I}_{b,m}$ for each combination of state preparation (±a or ±b) and outcome m are thereby obtained and recorded.

Experimentally, this means that in order to measure the POVM elements $q{P}_{\pm }({\vec{r}}_{1})$, α is chosen so that $| \langle +z| \psi \rangle {| }^{2}=q$ and DC-Coil 3 plus Analyzer 2 are conditioned to measure ${P}_{\pm }({\vec{r}}_{1})=| \pm {\vec{r}}_{1}\rangle \langle \pm {\vec{r}}_{1}|$ , where $| +{\vec{r}}_{1}\rangle ={U}_{3}({\vartheta }_{1},{\varphi }_{1})| +z\rangle$ and $| -{\vec{r}}_{1}\rangle ={U}_{3}({\vartheta }_{1}+\pi ,-{\varphi }_{1})| +z\rangle$ (since DC-Coil 3 controls the angle of the neutrons before Analyzer 2, and thus both the direction of the projection and which outcome ± is measured). To measure $(1-q){P}_{\pm }({\vec{r}}_{2})$, α is changed to α + π so that $| \langle +z| \psi \rangle {| }^{2}=1-q$ and DC-Coil 3 plus Analyzer 2 measure instead ${P}_{\pm }({\vec{r}}_{2})=| \pm {\vec{r}}_{2}\rangle \langle \pm {\vec{r}}_{2}|$ (with $| +{\vec{r}}_{2}\rangle ={U}_{3}({\vartheta }_{2},{\varphi }_{2})| +z\rangle$ and $| -{\vec{r}}_{2}\rangle ={U}_{3}({\vartheta }_{2}+\pi ,-{\varphi }_{2})| +z\rangle$ ). The desired state is prepared by a rotation induced by DC-Coil 2, which is randomly chosen based on the signal from a uniform random number generator. As described in the main text, $| \pm a\rangle$ correspond to directions $\pm {\vec{e}}_{z}$ , while $| \pm b\rangle$ are chosen in the yz-plane.

After monochromatization and polarization the neutron count rate at the tangential beam port is roughly $2000\,{{\rm{cm}}}^{-2}\,{{\rm{s}}}^{-1}$ . The count rate at the last detector, which is ≈3 m from the polarizer, is at most approximately 40 neutrons per second; it is limited by the beam divergence of approximately 1°, the transmission efficiency of the supermirrors (40%) and the scattering and absorption of neutrons in the copper wires of the coils. The detection efficiency for thermal neutrons is almost 1, owing to the high absorption cross section of the ${}^{10}{\rm{B}}$-enriched BF₃ gas detector (cylindrical counter tube with 6 cm opening diameter and 40 cm length). The discrete counts at fixed rates in time are described by a Poisson distribution, which implies that one standard deviation of statistical error is given by the square root of the mean value. Depolarization through ambient magnetic fields is suppressed by a 13 Gauss magnetic guide field. A slightly imperfect spin separation in the supermirror leads to a slight mixture of spin states and therefore to a loss of contrast from 100% to roughly 98%. To cope with this systematic imperfection, the intensity modulation of the polarization is fitted with $\tfrac{1}{2}(c+d\cos (x))$ , which in the ideal case would simply be $\tfrac{1}{2}(1+\cos (x))$ . In order to take the efficiency of the detector into account, the relative rather than the absolute values of the fit parameters c, d are used.
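The size of the statistical error implied by Poisson counting can be estimated with a few lines of code. The example below combines the maximum count rate of about 40 neutrons per second with a 60 s measurement slot; pairing these two numbers is only an illustration, since the rate actually recorded depends on the configuration being measured. The function name is hypothetical.

```python
import math

def poisson_error(rate_per_s, duration_s):
    """Mean count, one-standard-deviation error and relative error
    for Poisson-distributed counts at a fixed rate."""
    mean = rate_per_s * duration_s
    sigma = math.sqrt(mean)   # Poisson statistics: standard deviation = sqrt(mean)
    return mean, sigma, sigma / mean

mean, sigma, relative = poisson_error(40.0, 60.0)
print(mean, sigma, relative)
```

At this upper bound the relative statistical error per slot is about 2%; lower count rates give proportionally larger relative errors, scaling as the inverse square root of the mean count.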

Appendix C. Additional details on the data evaluation

For each pair of measurements A and B for which the uncertainty relation was to be probed, a range of different measurements ${ \mathcal M }$ were implemented using the experimental methods described in the previous section (each giving one point on figures 2 or 3 of the main text). For each such measurement (and choice of observables) the counts ${I}_{a,m}$ and ${I}_{b,m}$ are obtained, where m = 1, 2, 3, 4 (corresponding to the outcomes of ${ \mathcal M }=\left\{{{qP}}_{+}({\vec{r}}_{1}),{{qP}}_{-}({\vec{r}}_{1}),(1-q){P}_{+}({\vec{r}}_{2}),(1-q){P}_{-}({\vec{r}}_{2})\right\}$ ), giving a total of 16 counts.

In order to calculate the noises $N({ \mathcal M },A)$ and $N({ \mathcal M },B)$ from these counts, the joint probability distributions p(a, m) and p(b, m) are first estimated as

$\begin{eqnarray}&&p(a,m)=\displaystyle \frac{{I}_{a,m}}{{\displaystyle \sum }_{a,m}{I}_{a,m}}\,,\qquad p(b,m)=\displaystyle \frac{{I}_{b,m}}{{\displaystyle \sum }_{b,m}{I}_{b,m}}.\end{eqnarray} \tag{ C.1 }$

From equation (A.2) one can calculate that the conditional probabilities $p(a| m)$ and $p(b| m)$ are thus given by

$\begin{eqnarray}&&p(a| m)=\displaystyle \frac{{I}_{a,m}}{{\displaystyle \sum }_{a}{I}_{a,m}}\,,\qquad p(b| m)=\displaystyle \frac{{I}_{b,m}}{{\displaystyle \sum }_{b}{I}_{b,m}}\end{eqnarray} \tag{ C.2 }$

from which the noise can be calculated directly from equation (A.1).
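The estimation chain of equations (C.1) and (C.2) can be sketched as follows. Equation (A.1) is not reproduced in this section, so the code assumes that the noise is the conditional entropy of the prepared eigenvalue given the outcome, $-{\sum }_{a,m}p(a,m){\mathrm{log}}_{2}\,p(a| m)$; the function names and the example counts in the usage note are hypothetical and not experimental data.

```python
import math

def joint_and_conditional(counts):
    """counts[x][m]: counts for prepared eigenstate x and outcome m.
    Returns the joint p(x, m) of (C.1) and the conditional p(x | m) of (C.2)."""
    total = sum(sum(row) for row in counts)
    n_outcomes = len(counts[0])
    joint = [[c / total for c in row] for row in counts]
    per_outcome = [sum(row[m] for row in counts) for m in range(n_outcomes)]
    cond = [[row[m] / per_outcome[m] if per_outcome[m] > 0 else 0.0
             for m in range(n_outcomes)] for row in counts]
    return joint, cond

def noise(counts):
    """Assumed noise: conditional entropy (bits) of the prepared state given m."""
    joint, cond = joint_and_conditional(counts)
    total = 0.0
    for x, row in enumerate(joint):
        for m, p in enumerate(row):
            if p > 0:   # terms with p(x, m) = 0 contribute nothing
                total -= p * math.log2(cond[x][m])
    return total
```

As a sanity check, counts such as `[[100, 0, 0, 0], [0, 100, 0, 0]]` (outcome perfectly correlated with the prepared state) give zero noise, while uniform counts give the maximal value of 1 bit.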

In order to saturate the noise–noise uncertainty relation with POVMs, we are particularly interested in families of POVMs

$\begin{eqnarray}&&{ \mathcal M }=\left\{{{qP}}_{+}({\vec{r}}_{1}),{{qP}}_{-}({\vec{r}}_{1}),(1-q){P}_{+}({\vec{r}}_{2}),(1-q){P}_{-}({\vec{r}}_{2})\right\}\end{eqnarray} \tag{ C.3 }$

with different values of q but ${\vec{r}}_{1}$ and ${\vec{r}}_{2}$ fixed. While the target value of q is chosen, as described earlier, by controlling the current in DC-Coil 1, in practice the effective value of q might vary slightly from the desired one. This is a consequence of the high currents applied in the wires, which cause slight variations of resistance and hence fluctuations of the magnetic field over time. More precise estimates of the effective values of q implemented can be calculated from the counts ${I}_{a,m}$ and ${I}_{b,m}$. To this end, note that p(m = 1) + p(m = 2) = q; in practice, one may arrive at slightly different values by calculating this from p(a, m) or p(b, m) due to statistical fluctuations, so we estimate q as the average obtained from both these distributions. We thus have

$\begin{eqnarray}\begin{array}{rcl}q & = & \displaystyle \frac{1}{2}\left(\displaystyle \sum _{a}\left[p(a,m=1)+p(a,m=2)\right]+\displaystyle \sum _{b}\left[p(b,m=1)+p(b,m=2)\right]\right)\\ & = & \displaystyle \frac{1}{2}\left(\displaystyle \frac{{\displaystyle \sum }_{a}({I}_{a,1}+{I}_{a,2})}{{\displaystyle \sum }_{a,m}{I}_{a,m}}+\displaystyle \frac{{\displaystyle \sum }_{b}({I}_{b,1}+{I}_{b,2})}{{\displaystyle \sum }_{b,m}{I}_{b,m}}\right).\end{array}\end{eqnarray} \tag{ C.4 }$

Note that this value of q is only used to help present and understand our results for families of measurements with varying values of q, but is not needed to calculate the noises, which are computed directly from the counts ${I}_{a,m}$ and ${I}_{b,m}$.
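Equation (C.4) amounts to averaging, over the two preparation bases, the fraction of counts that fall into outcomes m = 1 and m = 2. A minimal sketch is given below; the function name and the count tables are made up for illustration and do not correspond to measured data.

```python
def estimate_q(counts_a, counts_b):
    """Estimate q as in equation (C.4).
    counts_x[k][m-1]: counts for the k-th eigenstate of X and outcome m."""
    def fraction_first_two(counts):
        total = sum(sum(row) for row in counts)
        first_two = sum(row[0] + row[1] for row in counts)
        return first_two / total
    return 0.5 * (fraction_first_two(counts_a) + fraction_first_two(counts_b))

# Hypothetical counts: rows are the prepared eigenstates (+, -), columns m = 1..4.
counts_a = [[30, 10, 45, 15], [10, 30, 15, 45]]
counts_b = [[25, 17, 40, 18], [17, 25, 18, 40]]

q_est = estimate_q(counts_a, counts_b)
print(q_est)
```

In this made-up example the two bases individually suggest 0.40 and 0.42, and the averaged estimate lies between them.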

Typical examples of the counts ${I}_{a,m}$ and ${I}_{b,m}$ required to determine the entropic noises $N({ \mathcal M },A)$ and $N({ \mathcal M },B)$ are plotted in figures C1 and C2. The counts shown in figures C1(a) and C2(a) are proportional to the joint probabilities p(a, m) as well as the conditional probabilities $p(m| a)$ (see equation (C.1)). As a result, when the measurement corresponds to a projective measurement in direction ${\vec{r}}_{1}=\vec{a}$ (ϑ₁ = 0 in figure C1(a)), outcome m = 1 occurs with probability almost 1 (exactly 1 in the ideal case). In figure C1(a), as ϑ₁ is increased, the measurement becomes noisy and outcome m = 2 becomes more probable. In figure C2(a), when q ≃ 1 the measurement is projective along a direction ${\vec{r}}_{1}$ very close to $\vec{a}$ , while as q is decreased towards 0, ${P}_{\pm }({\vec{r}}_{1})$ is mixed with ${P}_{\pm }({\vec{r}}_{2})$ and outcomes 3 and 4 become more probable as the noisy measurement in direction ${\vec{r}}_{2}$ (close to $\vec{b}$ ) is performed more often by the POVM. The counts ${I}_{b,m}$ are interpreted similarly, except that, since the eigenstates of A and B are not aligned, the corresponding distributions are never simultaneously peaked with respect to both observables.