Extracting the physical sector of quantum states

The physical nature of any quantum source guarantees the existence of an effective Hilbert space of finite dimension, the physical sector, in which its state is completely characterized with arbitrarily high accuracy. The extraction of this sector is essential for state tomography. We show that the physical sector of a state, defined in some pre-chosen basis, can be systematically retrieved with a procedure using only data collected from a set of commuting quantum measurement outcomes, with no other assumptions about the source. We demonstrate the versatility and efficiency of the physical-sector extraction by applying it to simulated and experimental data for quantum light sources, as well as quantum systems of finite dimensions.


Introduction
The physical laws of quantum mechanics ensure that all experimental observations can be described in an effective Hilbert space of finite dimension, to which we shall refer as the physical sector of the state. The systematic extraction of this physical sector is crucial for reliable quantum state tomography.
Photonic sources constitute an archetypical example where such an extraction is indispensable. Theoretically, the states describing these sources reside in an infinite-dimensional Hilbert space. Nonetheless, the elements of the associated density matrices decay to zero for sufficiently large photon numbers, so that there always exists a finite-dimensional physical sector that contains the state with sufficient accuracy. Reliable state tomography can thus be performed once this physical sector is correctly extracted.
Experiments on estimates of the correct physical sector have been carried out [1,2]. One common strategy is to make an educated guess about the state (such as Gaussianity [3] or rank-deficiency for compressed sensing [4][5][6][7][8][9][10]), which defines a truncated reconstruction subspace. For instance, in compressed sensing the rank of the state is assumed to be no larger than a certain value r, so that specialized rank-r compressed-sensing measurements can be employed to uniquely characterize the state with far fewer measurement settings. Very generally, educated guesses of certain properties of the state require additional physical verification. Algorithms for statistical model selection, such as the Akaike [11][12][13] or Schwarz criteria [14,15] or the likelihood sieve [16,17], have also been developed to estimate the physical sector. These algorithms provide another practical solution for reducing the complexity of the tomography problem. In the presence of the positivity constraint [18,19], their application to quantum states becomes more sophisticated, as the procedures for deriving stopping criteria that supply the final appropriate model subspace for the unknown state are intricate.
On the other hand, finite-dimensional systems represent another example for which a systematic physical-sector extraction becomes important. In the context of quantum information, ongoing developments in dimension-witness testing [20][21][22][23][24] offer some solutions to finding the minimal dimension of a black box required to justify the given set of measurement data in a device-independent way. Searching for dimension witnesses of arbitrary dimensions is still challenging [23].
Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
In [25], we showed that, when the measurement device is calibrated, one can systematically extract the physical sector (that is, both the Hilbert-space support and dimension) and simultaneously reconstruct any unknown state directly from the measurement data without any assumption about the state. In this paper, we introduce an even more efficient procedure that extracts the physical sector of any state from the data without state reconstruction and provide the pseudocode. This procedure requires nothing more than data obtained from a set of commuting measurements. As in [25], the extraction of the physical sector does not depend on any other assumptions or calibration details about the source. By construction, this procedure has a linear complexity in the dimension of the physical sector. To showcase its versatility, we apply it to simulated and experimental data for photonic sources and systems of finite dimensions. In this way, we offer a deterministic solution to the problem of extracting the correct physical sector for any quantum state in measurement-calibrated situations.
2. Physical sectors and commuting measurements
2.1. What are physical sectors?
The concept of physical sectors and their relation to commuting measurements is probably best understood with a concrete example. Let us consider, in the Fock basis, a quantum state of light described by the density operator

$$\rho = \begin{pmatrix} \rho_{00} & * & \rho_{02} & * & \cdots \\ * & * & * & * & \cdots \\ \rho_{20} & * & \rho_{22} & * & \cdots \\ * & * & * & * & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \tag{2.1}$$

where $*$ denotes elements of its density matrix that are so tiny that treating them as zero incurs very small truncation errors. If all $* = 0$, ρ is the pure state $|\psi\rangle\langle\psi|$ described by $|\psi\rangle \propto |\alpha\rangle + |-\alpha\rangle$, where $|\alpha\rangle$ is the coherent state of amplitude $\alpha = 0.3536$. As with any physical state, the density-matrix elements drop to zero for sufficiently large photon numbers.
Some statistical reasoning is needed to understand the truncation error. For now, we note that since all other $*$ elements are tiny, the state ρ is essentially fully characterized by a three-dimensional sector, such that elements beyond this sector supply almost no contribution to ρ. This forms a truncated Hilbert subspace where tomography can be carried out reliably. This subspace is given by $\mathcal{H}_{\rm sub} = \mathrm{span}\{|0\rangle, |1\rangle, |2\rangle\}$. However, from (2.1), we realize that this subspace is not the smallest one that supports ρ. The smallest subspace $\mathcal{H}_{\rm phys} = \mathrm{span}\{|0\rangle, |2\rangle\}$ is in fact spanned by only two basis kets. This defines the two-dimensional physical sector. In general, the physical sector $\mathcal{H}_{\rm phys}$ is defined to be the smallest Hilbert subspace that fully supports a given state with a truncation error smaller than some tiny $\epsilon$ in some basis. Evidently, the choice of basis affects the description of $\mathcal{H}_{\rm phys}$. If one already knows that ρ is close to $|\psi\rangle\langle\psi|$, then choosing $|\psi\rangle$ as part of a basis gives a one-dimensional $\mathcal{H}_{\rm phys}$. Such knowledge is of course absent when ρ is unknown. In such a practical scenario in quantum optics, we may adopt the most common Fock basis for representing ρ and $\mathcal{H}_{\rm phys}$. When dealing with general quantum systems, the basis that is most natural in typical experiments may be chosen, such as the Pauli computational basis for qubit systems.
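The example can be checked numerically. The following is a minimal sketch, assuming NumPy, a Fock cutoff `n_max = 8` and a truncation threshold `eps = 1e-4`; both values are illustrative choices and do not come from the text. It builds the Fock amplitudes of the superposition of the two coherent states with amplitude 0.3536 and lists the basis kets whose populations exceed the threshold.

```python
import math
import numpy as np

alpha = 0.3536   # coherent-state amplitude quoted in the text
n_max = 8        # assumed Fock cutoff, used only for this illustration
eps = 1e-4       # assumed truncation threshold

# Unnormalized Fock amplitudes of |alpha> + |-alpha>; the odd terms cancel exactly.
n = np.arange(n_max + 1)
coh = np.array([alpha**k / math.sqrt(math.factorial(k)) for k in n])
amps = coh + (-1.0) ** n * coh
amps /= np.linalg.norm(amps)

populations = amps**2   # diagonal elements rho_nn in the Fock basis
phys = [int(k) for k in n if populations[k] > eps]
print(phys)  # -> [0, 2]
```

With this threshold only the kets labeled 0 and 2 survive, which agrees with the two-dimensional physical sector described above. A much smaller `eps` would also admit the ket labeled 4, so the extracted sector always refers to a stated truncation error.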

2.2. How are physical sectors related to commuting measurements?
Let us revisit the example in (2.1). Because of the positivity constraint imposed on ρ, whenever a diagonal element is $*$, the elements in the row and column that intersect this element are all $*$. Also, if a diagonal element is not $*$, then the basis ket for this diagonal element is evidently one of the kets that span $\mathcal{H}_{\rm phys}$. For this example, the two-dimensional $\mathcal{H}_{\rm phys}$ completely characterizes ρ with the $2^2 = 4$ elements $\rho_{00}$, $\rho_{22}$, $\mathrm{Re}(\rho_{02})$ and $\mathrm{Im}(\rho_{02})$. It follows that knowing the locations of the significant diagonal elements is all we need to ascertain $\mathcal{H}_{\rm phys}$. For this purpose the only necessary tool is a set of commuting measurement outcomes whose common eigenbasis is the pre-chosen basis for $\mathcal{H}_{\rm phys}$. After the measurement data are collected with these commuting outcomes, all one needs to do is perform an extraction procedure on the data to obtain $\mathcal{H}_{\rm phys}$. This procedure tests a growing set of basis kets until it reports that the current set spans an $\mathcal{H}_{\rm phys}$ that fully supports the data. We note here that the extraction works for any other sort of generalized measurements in principle, although we shall consider commuting measurements in subsequent discussions since they are the simplest kind necessary for extracting physical sectors in large Hilbert-space dimensions.

The extraction of the physical sector
In some pre-chosen basis, the physical-sector extraction procedure (PSEP) iteratively checks whether its data are supported by the cumulative sequence of $\mathcal{H}_{\rm sub}$ with truncation error smaller than some tiny $\epsilon$. PSEP starts by deciding whether, say, $\mathcal{H}_{\rm sub} = \mathrm{span}\{|n_1\rangle, |n_2\rangle\}$ of the smallest dimension $d = 2$ adequately supports the data. If yes, it takes this as the two-dimensional $\mathcal{H}_{\rm phys}$. Otherwise, PSEP continues and decides whether $\mathcal{H}_{\rm sub} = \mathrm{span}\{|n_1\rangle, |n_2\rangle, |n_3\rangle\}$ of dimension $d = 3$ adequately supports the data, and so on until finally PSEP assigns a $d_{\rm phys}$-dimensional $\mathcal{H}_{\rm sub} = \mathcal{H}_{\rm phys}$ with some statistical reliability. In each iterative step, there are three objectives to be met:
(i) PSEP must decide whether or not the data are supported with $\mathcal{H}_{\rm sub}$ spanned by some set of basis kets.
(ii) PSEP must report the reliability of the statement '$\mathcal{H}_{\rm sub}$ supports ρ with truncation error less than $\epsilon$'.
(iii) PSEP must ensure that the final accepted set of basis kets spans $\mathcal{H}_{\rm phys}$, the smallest $\mathcal{H}_{\rm sub}$ that supports ρ.
In what follows, we show that all these objectives can be fulfilled with only the information encoded in the measurement data.
3.1. Deciding whether the data are supported with some subspace
We proceed by first listing some notation. In an experiment, a set of measured commuting outcomes is described by positive operators $\Pi_j$ with $\sum_j \Pi_j = 1$. They give measurement probabilities $p_j = \mathrm{tr}(\rho\,\Pi_j)$ according to the Born rule. Each commuting outcome, in the common eigenstates $|n\rangle\langle n|$ that are also used to represent the physical sector, can be written as

$$\Pi_j = \sum_l c_{jl}\, |n_l\rangle\langle n_l| \tag{3.1}$$

with positive weights $c_{jl}$ that characterize the outcome.
To decide whether the $p_j$s are supported with some Hilbert subspace $\mathcal{H}_{\rm sub}$, the easiest way is to introduce Hermitian decision observables

$$W_{\rm sub} = \sum_j y_j \Pi_j$$

for real parameters $y_j$. The decision observable for testing $\mathcal{H}_{\rm sub}$, along with its $y_j$s, satisfies the defining property

$$\langle n|W_{\rm sub}|n\rangle \begin{cases} = 0 & \text{if } |n\rangle \in \mathcal{H}_{\rm sub}, \\ > 0 & \text{otherwise.} \end{cases} \tag{3.3}$$

This property automatically ensures that if ρ is completely supported in $\mathcal{H}_{\rm sub}$, then the expectation value $\langle W_{\rm sub}\rangle = \sum_j y_j p_j = 0$ with zero truncation error and PSEP takes this to be the physical sector ($\mathcal{H}_{\rm sub} = \mathcal{H}_{\rm phys}$). Quantum systems of finite dimensions possess states of this kind. In quantum optics, however, ρ is not completely supported in any subspace, but possesses density-matrix elements that decay with increasing photon number (such as the example in (2.1)). A laser source, for instance, cannot produce light of infinite intensity. Furthermore, the Born probabilities $p_j$ are never measured. Instead, the data consist of relative frequencies $f_j$ that estimate the probabilities with statistical fluctuation. Therefore, if we define the decision random variable (RV)

$$w_{\rm sub} = \sum_j y_j f_j,$$

then PSEP assigns $\mathcal{H}_{\rm sub} = \mathcal{H}_{\rm phys}$ whenever the truncation error, defined by $|w_{\rm sub}|$, is smaller than $\epsilon$.
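To make the defining property of the decision observable concrete, here is a minimal sketch assuming NumPy. The measurement matrix `C`, the diagonal `rho_diag` and the helper name `decision_weights` are hypothetical, and the positive values outside the tested subspace are all set to 1, which is only one admissible choice. The weights y_j are obtained by solving the linear system sum_j y_j c_jl = a_l, which has an exact solution when the columns of `C` are linearly independent.

```python
import numpy as np

def decision_weights(C, sub):
    """Solve sum_j y_j c_jl = a_l with a_l = 0 inside the tested subspace, 1 outside."""
    C = np.asarray(C, dtype=float)
    a = np.ones(C.shape[1])
    a[list(sub)] = 0.0
    y, *_ = np.linalg.lstsq(C.T, a, rcond=None)  # minimum-norm solution
    return y

# Hypothetical commuting measurement: 5 outcomes, 4 basis kets, columns sum to 1.
C = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.05, 0.05, 0.05, 0.35],
    [0.05, 0.05, 0.05, 0.35],
])
rho_diag = np.array([0.9, 0.0, 0.1, 0.0])  # state supported on kets 0 and 2
p = C @ rho_diag                            # Born probabilities p_j

w_small = float(decision_weights(C, [0]) @ p)     # subspace too small
w_phys = float(decision_weights(C, [0, 2]) @ p)   # subspace equals the support
print(f"{w_small:.3f} {abs(w_phys):.3f}")  # -> 0.100 0.000
```

The expectation value equals the population left outside the tested subspace, so it vanishes only once the subspace contains the full support. With measured frequencies in place of `p`, the same inner product gives the decision random variable.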

Quantifying the reliability of the truncation error report
The decision RV $w_{\rm sub}$ is an unbiased RV in that the data average of $w_{\rm sub}$ is the true value $\langle W_{\rm sub}\rangle$ that PSEP [...] we are assured with α significance that the main factor for a non-zero $|w_{\rm sub}|$ comes from insufficient support from $\mathcal{H}_{\rm sub}$ since statistical fluctuation is heavily suppressed.
One can obtain the more experimentally friendly inequality (3.7) [26], expressed in terms of the variance $\Delta^2$ of $w_{\rm sub}$, where we take $\epsilon \approx |w_{\rm sub}|$ as a sensible guide to the truncation error threshold. For $N \gg 1$, the $1/N$ scaling of $\Delta^2$ allows the quantity $B_{\rm sub}$ to provide an indication of the reliability of the statement '$\mathcal{H}_{\rm sub}$ supports ρ with truncation error less than $\epsilon$' with a reasonable statistical estimate for $\Delta^2$ from the data. If (3.7) holds for $\mathcal{H}_{\rm sub}$ and some pre-chosen α, then the assignment $\mathcal{H}_{\rm phys} = \mathcal{H}_{\rm sub}$ is made. Quite generally, $w_{\rm sub}$ and $\Delta^2$ reveal the influence of both statistical and systematic errors [28]. Therefore, by construction, for sufficiently large N, $\mathcal{H}_{\rm sub}$ eventually converges to the unique $\mathcal{H}_{\rm phys}$ at α significance with increasing size of the basis set for properly chosen $\mathcal{H}_{\rm sub}$. The choice of $\mathcal{H}_{\rm sub}$ at each iterative step of PSEP must be made so that the final extracted support is indeed $\mathcal{H}_{\rm phys}$, the smallest support for ρ.
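The 1/N scaling of the variance can be illustrated with a short sketch, assuming NumPy and assuming that the relative frequencies come from N independent detection events, so that they follow multinomial statistics. Under that assumption the variance of a weighted sum of frequencies is (sum_j y_j^2 p_j - (sum_j y_j p_j)^2)/N, and the sketch uses the plug-in estimate with the frequencies in place of the probabilities. The function name `decision_stats` and the numbers are hypothetical.

```python
import numpy as np

def decision_stats(y, f, n_events):
    """Return the decision value w and a plug-in estimate of its variance."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    w = float(y @ f)
    # Multinomial plug-in estimate: (sum_j y_j^2 f_j - w^2) / N
    var = float((y**2) @ f - w**2) / n_events
    return w, var

y = np.array([1.0, -1.0, 0.5])   # hypothetical decision weights
f = np.array([0.5, 0.3, 0.2])    # hypothetical relative frequencies
for n_events in (10**3, 10**4, 10**5):
    w, var = decision_stats(y, f, n_events)
    print(n_events, w, var)
```

Each tenfold increase in the number of detection events lowers the variance estimate by a factor of ten while the decision value stays fixed, which is why a non-zero decision value can be attributed to insufficient support once N is large.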

Ensuring that the physical sector is extracted, not another larger support
To ensure that $\mathcal{H}_{\rm phys}$ is really extracted, and not some other larger $\mathcal{H}_{\rm sub}$ that also supports the data, we once more return to the example in (2.1). For that pure state, in the Fock basis, the $\mathcal{H}_{\rm sub}$ that supports the state is effectively three-dimensional, whereas $\mathcal{H}_{\rm phys}$ is effectively two-dimensional. With a sufficiently large number of detection events N, if one naively carries out PSEP starting from $\mathcal{H}_{\rm sub} = \mathrm{span}\{|0\rangle\}$, PSEP would recognize that $\mathcal{H}_{\rm sub}$ cannot support the data and continue to test the next larger subspace $\mathcal{H}_{\rm sub} = \mathrm{span}\{|0\rangle, |1\rangle\}$, where it would again conclude insufficient support. Only after the third step will PSEP accept $\mathcal{H}_{\rm sub} = \mathrm{span}\{|0\rangle, |1\rangle, |2\rangle\}$ as the support at some fixed α significance. However, $\mathcal{H}_{\rm sub} \neq \mathcal{H}_{\rm phys}$. In order to efficiently extract $\mathcal{H}_{\rm phys}$, we need only one additional clue from the data, namely the relative size of the diagonal elements of ρ. We emphasize here that we are not interested in the precise values of the diagonal elements, but only in a very rough estimate of their relative ratios to guide PSEP. With this clue, we can then apply PSEP using the appropriately ordered sequence of basis kets to most efficiently terminate PSEP and obtain the smallest possible support for the data. For the pure state example, the decreasing magnitude of the diagonal elements gives the order $\{|0\rangle, |2\rangle\}$. For any arbitrary set of commuting $\Pi_j$s, given the measurement matrix $C$ of coefficients $c_{jl}$, sorting the column $C^{-}f$, defined by the Moore-Penrose pseudoinverse $C^{-}$ of $C$, in descending order suffices to guide PSEP. This sorting permits the efficient completion of PSEP in $O(d_{\rm phys})$ steps without doing quantum tomography. Other sorting algorithms are, of course, possible without any information about the diagonal element estimates. One can perform other tests on different permutations of basis kets within the extracted Hilbert subspace support, although the number of steps required to complete PSEP would be larger than $O(d_{\rm phys})$.
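The sorting step can be sketched in a few lines, assuming NumPy. The matrix `C` and the frequency column `f` below are hypothetical, and `sorted_basis_order` is an illustrative name rather than anything defined in the text. The pseudoinverse applied to the data gives rough estimates of the diagonal elements, and only their ranking is used.

```python
import numpy as np

def sorted_basis_order(C, f):
    """Rank basis kets by the rough diagonal estimates given by pinv(C) @ f."""
    est = np.linalg.pinv(np.asarray(C, dtype=float)) @ np.asarray(f, dtype=float)
    return np.argsort(-est, kind="stable")  # indices in descending order of estimate

# Hypothetical commuting measurement: 5 outcomes, 4 basis kets, columns sum to 1.
C = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.05, 0.05, 0.05, 0.35],
    [0.05, 0.05, 0.05, 0.35],
])
f = C @ np.array([0.9, 0.0, 0.1, 0.0])  # noiseless stand-in for measured frequencies

order = sorted_basis_order(C, f)
print(order[:2])  # the two leading kets are 0 and 2 for this input
```

The ranking among kets with negligible estimates is arbitrary, which is harmless because the procedure stops before reaching them.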

An important afterword on physical-sector extraction
An astute reader would have already noticed that it is the $\mathcal{H}_{\rm phys}$ within the field-of-view (FOV) of the data that can be reliably extracted. The FOV is affected by three factors: the degree of linear independence of the measured outcomes, the choice of some very large subspace on which to apply PSEP, whose dimension does not exceed this degree of linear independence, and the accuracy of the data (the value of N). In real experiments, the number of linearly independent outcomes measured is always finite. With the corresponding finite data set, there exists a large subspace for extracting $\mathcal{H}_{\rm phys}$, in which the decision observables $W_{\rm sub}$ always satisfy (3.3) for any $\mathcal{H}_{\rm sub}$. For sufficiently large N, the collected data will capture all significant features of $\mathcal{H}_{\rm phys}$ within this data FOV.
Indeed, if the source is truly a black box, then defining the data FOV can be tricky. True black boxes are, however, atypical in a practical tomography experiment since it is usually the observer who prepares the state of the source and can therefore be confident that the state prepared should not deviate too far from the target state as long as the setup is reasonably well-controlled. The data FOV should therefore be guided by this common sense. On the other hand, the extraction of $\mathcal{H}_{\rm phys}$ in device-independent cryptography, where both the source and measurement are completely untrusted for arbitrary quantum systems, is still an open problem.
We note here that the measurement in (3.1) may incorporate realistic imperfections, such as noise and finite detection efficiency, that are faced in a number of realistic schemes. For instance, the commuting diagonal outcomes may represent on/off detectors of varying efficiencies, or incorporate thermal noise [29,30]. All such measurements are presumed to be calibratable, as non-calibrated measurements require other methods to probe the source. As an example, suppose that the measurement is inefficient but still trustworthy enough for the observer to describe its outcomes by the set $\{\eta_j \Pi_j\}$ with unknown efficiencies $\eta_j < 1$ that are simple functions of a few practical parameters of the setup such as transmissivities, losses and so forth. In other words, we have $\eta_j = \eta_j(T_1, \ldots, T_l)$ for $l$ that is typically much less than the total number of outcomes in practical experiments. Then the straightforward practice is to first calibrate all $T_j$s before using them to subsequently carry out PSEP for other sources. One may also choose to calibrate $T_j$ already during the sorting stage by 'solving' the linear system [...] is now linear in the data $f_j$ and nonlinear in $T_j$. The estimation of $T_j$ falls under parameter tomography that is beyond the scope of this discussion, which focuses on the idea of locating physical sectors and not the exact values of density matrices.

The pseudocode for physical-sector extraction
Suppose we have a set of commuting measurement data $\{f_j\}$ that form the column $f$, as well as the associated outcomes $\Pi_j$ of some eigenbasis $\{|n\rangle\}$ that is adopted to represent $\mathcal{H}_{\rm phys}$. For some pre-chosen basis and α significance, the pseudocode for PSEP is presented as follows:
step 1. Compute the measurement matrix $C$ and sort $C^{-}f$ in descending order to obtain the ordered index $i$. Then, define the ordered sequence of basis kets $|n_{i_1}\rangle, |n_{i_2}\rangle, \ldots$.
[...]
step 5. Increase $k$ by one and include $|n_{i_k}\rangle$ in $\mathcal{H}_{\rm sub}$.
step 6. Repeat steps 3 through 5 until (3.7) holds for $B_{\rm sub}$ and α. Finally, report $\mathcal{H}_{\rm phys} = \mathcal{H}_{\rm sub}$ and α and proceed to perform quantum-state tomography in $\mathcal{H}_{\rm phys}$.
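The overall loop can be sketched in Python as follows. This is an illustration under several stated assumptions and not the authors' implementation: the decision weights use the value 1 outside the tested subspace, the variance is the multinomial plug-in estimate, the reliability quantity is taken to be the Chebyshev-type ratio of that variance to the squared decision value, and a subspace is accepted once this ratio is no smaller than α. The function name `psep`, the matrix `C` and the test state are hypothetical.

```python
import numpy as np

def psep(C, f, n_events, alpha=0.05):
    """Return the sorted indices of basis kets spanning the extracted sector."""
    C = np.asarray(C, dtype=float)
    f = np.asarray(f, dtype=float)
    dim = C.shape[1]
    # Sort the rough diagonal estimates pinv(C) @ f in descending order.
    order = np.argsort(-(np.linalg.pinv(C) @ f), kind="stable")
    for k in range(1, dim + 1):
        sub = order[:k]
        # Assumed decision weights: 0 inside the tested subspace, 1 outside.
        a = np.ones(dim)
        a[sub] = 0.0
        y = np.linalg.lstsq(C.T, a, rcond=None)[0]
        w = float(y @ f)                           # decision value
        var = float((y**2) @ f - w * w) / n_events  # plug-in variance estimate
        w2 = w * w
        # Assumed criterion: accept once var / w^2 >= alpha.
        if w2 == 0.0 or var / w2 >= alpha:
            return sorted(int(i) for i in sub)
    return sorted(int(i) for i in order)

# Hypothetical commuting measurement: 5 outcomes, 4 basis kets, columns sum to 1.
C = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.05, 0.05, 0.05, 0.35],
    [0.05, 0.05, 0.05, 0.35],
])
f = C @ np.array([0.9, 0.0, 0.1, 0.0])  # noiseless stand-in for measured frequencies
print(psep(C, f, n_events=10**6))  # -> [0, 2]
```

In an experiment `f` would be the measured relative frequencies rather than exact probabilities. The loop visits at most as many subspaces as there are kets in the extracted sector, which reflects the linear complexity claimed for the procedure.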
Data statistical fluctuation may be further minimized by averaging $B_{\rm sub}$ over many different sets of commuting outcomes. Moreover, one can detect additional systematic errors that are not attributed to truncation artifacts by inspecting the corresponding histograms for errors larger than the statistical fluctuation.

[...] 2000 random sets of 40 commuting measurement outcomes were used to calculate the average $B_{\rm sub}$ in every iterative step k. The (blue) histogram plots $B_{\rm sub}$ for the default ordering of the basis kets labeled with $n = 0, 1, 2, \ldots$. The physical sector $\mathcal{H}_{\rm phys}$ (yellow region) is revealed after completing PSEP with respect to a 5% significance level ($\alpha = 0.05$) (red solid line).
We next proceed to experimentally validate PSEP by measuring photon-click events of a time-multiplexed detector (TMD). We use a fiber-integrated setup to generate and measure a mixture of coherent states, as depicted in figure 2(a). Coherent states are produced by a pulsed diode laser with 35 ps pulses at 200 kHz and a wavelength of 1550 nm. These pulses are then modulated with a telecom Mach-Zehnder amplitude modulator, driven with a square-wave signal at 230 kHz. This produces pseudorandom pulse patterns with two fixed amplitudes. After passing through fiber-attenuators, the state is measured with an eight-bin TMD [31,32] with a bin separation of 125 ns and two superconducting nanowire detectors. We record statistics of all possible $2^8$ bin configurations, which corresponds to a total of 256 TMD outcomes.
To characterize the TMD outcomes for the measurement, we perform standard detector tomography, using well-calibrated coherent probe states [33,34]. The setup is similar to the previous one, but we replace the modulator by a controllable variable attenuator. We calibrate the attenuation with respect to a power meter at the laser output. This allows us to produce a set of 150 probe states with a power separation of 0.2 dB.
TMD data of a statistical mixture of two coherent states are collected and PSEP is subsequently performed on these data. The accuracy of the extracted physical sector is ultimately sensitive to experimental imperfections. In this case, these imperfections are minimized owing to the state-of-the-art superconductor technology, the fruit of which is a histogram that is as clean as it gets in an experimental setting. Figure 2(b) provides convincing evidence of the feasibility and practical performance of the technique, where real data statistical fluctuation is present. This physical sector may subsequently be taken as the objective starting point for a more detailed investigation of the quantum signal with tools for tomography and diagnostics.

Finite-dimensional quantum systems
To analyze another aspect of PSEP, in this section, we apply it to quantum systems of finite dimensions with discrete-variable commuting measurement outcomes. As a specific example, we consider the arrangement in [22], which uses single photons to encode the information simultaneously in horizontal (H) and vertical (V) polarizations, and in two spatial modes (a and b). We define four basis states: [...]. On passing through three suitably oriented half-wave plates at angles $\theta_1$, $\theta_2$, and $\theta_3$, the state of such hybrid systems can be converted to the pure state $\rho = |\theta_1, \theta_2, \theta_3\rangle\langle\theta_1, \theta_2, \theta_3|$ [...]. Thus, by adjusting the orientation angles of the wave plates, one could produce qubits, qutrits or ququarts from such a hybrid source. Here, we show that PSEP can rapidly extract $\mathcal{H}_{\rm phys}$ by inspecting only the data measured from a set of commuting quantum measurements. Figure 3 presents the plots for a qubit, qutrit and ququart system characterized by the different $(\theta_1, \theta_2, \theta_3)$ configurations.
We have thus shown that in the typical experimental scenarios where the measurement setup is reasonably well calibrated, and hence trusted, $\mathcal{H}_{\rm phys}$ can be systematically extracted within the subspace spanned by the measurement outcomes. This allows an observer to later probe the details of the unknown but trusted quantum source using only the data at hand. Notice that the relevant basis states, labeled by n, form a basis for the commuting measurement on the black box. As such, this procedure is not a bootstrapping instruction. Rather, it systematically identifies the correct $\mathcal{H}_{\rm phys}$ without any other ad hoc assertions about the source. In this way, we turn PSEP into an efficient deterministic dimension tester with complexity $O(d_{\rm phys})$, as we have already learnt from section 3.3.

Conclusions
We have formulated a systematic procedure to extract the physical sector, the smallest Hilbert subspace support, of an unknown quantum state using only the measurement data and nothing else. This is possible because information about the physical sector is always entirely encoded in the data. This extraction requires only a few efficient iterative steps, of the order of the physical-sector dimension.
We demonstrated the validity and versatility of the procedure with simulated and experimental data from quantum light sources, as well as finite-dimensional quantum systems. The results support the clear message that, for well-calibrated measurement devices, the physical sector can always be systematically extracted and verified with statistical tools, in which quantum-state tomography can be performed accurately. No a priori assumptions about the source, which require additional testing, are necessary. The proposed method should serve as a reliable solution for realistic tomography experiments in quantum systems of complex degrees of freedom.

Figure 3. PSEP for hybrid quantum systems of finite dimensions that potentially generate either (a) a qubit state, (b) a qutrit state or (c) a ququart state according to equation (5.1). With $N = 2.5 \times 10^6$ detection events, all three physical sectors (yellow region) are correctly extracted. For the ququart, the slightly higher reordered $B_{\rm sub}$ bar at $n = 2$ (which goes to zero for larger N) is a manifestation of the favorable sensitivity of the procedure to specific quantum-state features, not just the overall physical sector.