Maximum-likelihood coherent-state quantum process tomography

Coherent-state quantum process tomography (csQPT) is a method of completely characterizing a quantum-optical "black box" by probing it with coherent states and performing homodyne measurements on the output [M. Lobino et al., Science 322, 563 (2008)]. We present a technique for csQPT that is fully based on statistical inference, specifically, quantum expectation-maximization. The method relies on the Jamiolkowski isomorphism and iteratively reconstructs the process tensor in the Fock basis directly from the experimental data. This approach permits the incorporation of a priori constraints into the reconstruction procedure, thereby guaranteeing that the resulting process tensor is physically consistent. Furthermore, our method is easier to implement and requires a narrower range of coherent probe states than its predecessors. We test its feasibility using simulations on several experimentally relevant processes.


Introduction
The art of determining states of quantum systems - quantum tomography - relies on performing measurements on multiple copies of the state in various bases, followed by reconstruction of the state's density matrix by applying suitable algorithms to the procured data. Methods of state tomography can be extended to the quantum version of the "black box" problem [1,2,3], giving rise to quantum process tomography (QPT). In QPT, measurements of the black box's response to a certain set of probe states allow one to predict the effect of that black box on an arbitrary state within a given Hilbert space. QPT emerged in response to ever-increasing demands in the field of quantum information processing, as the assembly of any quantum information processor requires precise knowledge of each of its components [4].
A popular approach to QPT involves determining the output $\mathcal E(\hat\rho_i)$ for each state of a spanning set $\{\hat\rho_i\}$ of the space of density matrices over the Hilbert space of interest. Owing to the linearity of quantum processes with respect to density operators, the output of an arbitrary state $\hat\rho = \sum_i c_i \hat\rho_i$ can then be found as $\mathcal E(\hat\rho) = \sum_i c_i \mathcal E(\hat\rho_i)$.
This approach has recently been extended to the continuous-variable domain of quantum optics [5]. The reconstruction procedure involves probing the process with coherent states, i.e. simple laser pulses. It relies on the ability of the Glauber-Sudarshan P representation to express the density matrix of any quantum state as a linear combination of coherent states' density matrices. Improvements to the algorithm have been presented in [6]. The algorithm has been tested in an experiment characterizing quantum-optical memory [7]. Similar principles have recently been used to characterize quantum optical detectors [8,9]. This method, known as coherent-state QPT, or csQPT, has the advantage of employing only the easy-to-prepare coherent states for probing. However, the numerical reconstruction procedures employed in [5,6] involve an intermediate step of determining the density matrices of the output states $\mathcal E(|\alpha\rangle\langle\alpha|)$ for each probe coherent state $|\alpha\rangle$ and subsequently integrating with the P function. This approach requires a multistep calculation and is not guaranteed to yield a process that is physically plausible, i.e. completely positive and trace non-increasing.
We present a reconstruction scheme that does away with this intermediate step and reconstructs the process directly from the experimental data using pure statistical inference. The experimental setup is equivalent to that of [5] and is illustrated in figure 1. The process reconstruction algorithm, on the other hand, is entirely different: it relies on the iterative maximum-likelihood approach. Its major advantage is the possibility of incorporating a priori constraints into the reconstruction procedure in order to ensure physically consistent and meaningful results.
Maximum-likelihood methods have been successfully used in the past for quantum state estimation as well as QPT [4,10,11]. However, their role in QPT has been limited to discrete-variable state spaces. The technique presented in this paper extends the purview of maximum-likelihood QPT to the continuous-variable domain, thereby allowing physically consistent quantum process estimation from homodyne tomography experiments [12]. A further advantage of the present technique is that it requires a significantly narrower range of coherent probe states than [5,6]. We test our approach on a number of processes that are relevant to quantum optical information processing: identity, attenuation and photon creation. In doing so, we elaborate a number of recommendations for the practical use of the method.

Iterative process estimation using the Jamiolkowski isomorphism
The process reconstruction scheme presented in this paper is based on the application of a maximum-likelihood QPT scheme [4,11] to quadrature measurements in the Hilbert space associated with a harmonic oscillator. Consider a quantum optical process $\mathcal E$ acting upon an optical mode prepared in some quantum state $\hat\rho_m$. The positivity of density matrices requires $\mathcal E$ to be a completely positive (CP) map, in addition to being trace non-increasing [13]. The output state $\mathcal E(\hat\rho_m)$ of such a process can be subjected to optical homodyne measurements of its field quadratures $\hat x_\theta = \hat x\cos\theta + \hat p\sin\theta$, where $\hat x$ and $\hat p$ are the canonical position and momentum operators and $\theta$ is the local oscillator phase. For the output of the probe $\hat\rho_m$, the probability of detecting a specific quadrature value $x$ for a phase $\theta$ is given by

$$\mathrm{pr}^{(m)}(\theta,x) = \mathrm{Tr}\left[\hat\Pi(\theta,x)\,\mathcal E(\hat\rho_m)\right], \tag{1}$$

where $\hat\Pi(\theta,x) = |\theta,x\rangle\langle\theta,x|$ is the projector associated with the quadrature eigenstate $|\theta,x\rangle$ and the superscript $m$ on the left-hand side denotes the probe state index. The above expression can be considered a probability distribution function with $\mathcal E$ as the parameter. If one performs $N$ measurements for each of the $M$ input probe states $\hat\rho_m$, obtained as a set of phase and corresponding quadrature values $\{\theta_{i,m}, x_{i,m}\}$, where $1\le i\le N$ and $1\le m\le M$, one can form the log-likelihood functional

$$\mathcal L(\mathcal E) = \sum_{m=1}^{M}\sum_{i=1}^{N}\ln\mathrm{pr}^{(m)}(\theta_{i,m}, x_{i,m}). \tag{2}$$

This functional is convex over the space of CP maps [14]. The objective of maximum-likelihood estimation is to determine the parameter $\mathcal E_{\rm est}$ that is as close to the actual parameter as possible, by maximizing the likelihood functional $\mathcal L(\mathcal E)$ over the space of CP maps:

$$\mathcal E_{\rm est} = \underset{\mathcal E\,\in\,\mathrm{CP}}{\arg\max}\,\mathcal L(\mathcal E). \tag{3}$$

This optimization problem is not straightforward and has previously been handled through various methods such as the uphill simplex [15]. However, a more rigorous yet technically simpler approach involves the formulation of an extremal equation that maximizes the log-likelihood functional given in equation (2).
In order to carry out the reconstruction procedure, one needs first to select a basis for the representation of the process and the relevant operators. In the Fock (number-state) basis, the quantum process can be represented by a rank-4 tensor that relates the density matrices of the input and output states as [5,6]

$$[\mathcal E(\hat\rho)]_{jk} = \sum_{m,n}\mathcal E^{mn}_{jk}\,\rho_{mn}, \tag{4}$$

where $\mathcal E^{mn}_{jk} = \langle j|\mathcal E(|m\rangle\langle n|)|k\rangle$ and $\rho_{mn} = \langle m|\hat\rho|n\rangle$. Although the optical Hilbert space is of infinite dimension, in practical process tomography it is truncated to the span of the several lowest Fock states, as will be discussed later. Also, the projectors $\hat\Pi(\theta,x)$ can be expressed in this basis as

$$\Pi_{mn}(\theta,x) = \langle m|\hat\Pi(\theta,x)|n\rangle = \langle m|\theta,x\rangle\langle\theta,x|n\rangle, \tag{5}$$

where the overlap of the quadrature eigenstate with the number state is given by [10,16]

$$\langle m|\theta,x\rangle = e^{im\theta}\left(\frac{2}{\pi}\right)^{1/4}\frac{H_m(\sqrt{2}\,x)}{\sqrt{2^m m!}}\,e^{-x^2}, \tag{6}$$

with $H_m$ denoting the Hermite polynomials. With the basis selected, we proceed to formulating a numerical procedure for the reconstruction of the quantum process. For a concise mathematical visualization, we resort to the Jamiolkowski isomorphism between linear CP maps $\mathcal E$ taking operators on the Hilbert space $\mathcal H$ to operators on the space $\mathcal K$ and positive semidefinite operators $\hat E$ on the Hilbert space $\mathcal H\otimes\mathcal K$. The explicit relation between $\hat E$ and $\mathcal E$ is given by [14]

$$\hat E = \sum_{m,n,j,k}\mathcal E^{mn}_{jk}\,|m\rangle\langle n|\otimes|j\rangle\langle k|. \tag{7}$$
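For concreteness, the number-quadrature overlap $\langle m|\theta,x\rangle$ and the projector matrix elements $\Pi_{mn}(\theta,x)$ are straightforward to evaluate numerically. The sketch below (plain Python; the function names are ours) computes the Hermite polynomial by the recurrence $H_{m+1}(y) = 2yH_m(y) - 2mH_{m-1}(y)$:

```python
import cmath
import math

def quad_overlap(m, theta, x):
    """<m|theta,x> = e^{i m theta} (2/pi)^{1/4} H_m(sqrt(2) x)
                     e^{-x^2} / sqrt(2^m m!).
    Convention: vacuum quadrature variance 1/4 (hbar = 1/2)."""
    # physicists' Hermite polynomial H_m, evaluated at sqrt(2) x by recurrence
    y = math.sqrt(2.0) * x
    h_prev, h = 1.0, 2.0 * y
    if m == 0:
        herm = h_prev
    elif m == 1:
        herm = h
    else:
        for k in range(1, m):
            h_prev, h = h, 2.0 * y * h - 2.0 * k * h_prev
        herm = h
    amp = ((2.0 / math.pi) ** 0.25 * herm * math.exp(-x * x)
           / math.sqrt(2.0 ** m * math.factorial(m)))
    return cmath.exp(1j * m * theta) * amp

def projector_element(m, n, theta, x):
    """Pi_{mn}(theta, x) = <m|theta,x> <theta,x|n>."""
    return quad_overlap(m, theta, x) * quad_overlap(n, theta, x).conjugate()
```

The quadrature convention above is one common choice; other conventions rescale $x$ by a constant factor.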
With the definition in equation (7), the output $\hat\rho_{\rm out}\in\mathcal K$ of a process $\mathcal E$ for an input $\hat\rho_{\rm in}\in\mathcal H$ is

$$\hat\rho_{\rm out} = \mathrm{Tr}_{\mathcal H}\left[\hat E\left(\hat\rho_{\rm in}^{\,T}\otimes\hat I_{\mathcal K}\right)\right], \tag{8}$$

where $T$ denotes transposition in the number basis. In addition, one must apply the trace-preservation condition ($\mathrm{Tr}[\hat\rho_{\rm out}] = \mathrm{Tr}[\hat\rho_{\rm in}]$) to the process $\mathcal E$, which yields

$$\mathrm{Tr}_{\mathcal K}[\hat E] = \hat I_{\mathcal H}. \tag{9}$$

The reconstruction procedure can be extended to also encompass trace non-preserving processes, as will be shown subsequently. The problem has thus been reduced to the determination of $(\dim\mathcal H\,\dim\mathcal K)^2$ parameters subject to $\dim^2\mathcal H$ constraints. When the input and output Hilbert spaces are identical, this amounts to evaluating $\dim^4\mathcal H - \dim^2\mathcal H$ free parameters.
For the process output of the input probe state $\hat\rho_m$, the probability of reading a quadrature value $x$ for a given local oscillator phase $\theta$ can be obtained by substituting (8) into (1):

$$\mathrm{pr}^{(m)}(\theta,x) = \mathrm{Tr}\left[\hat E\left(\hat\rho_m^{\,T}\otimes\hat\Pi(\theta,x)\right)\right]. \tag{10}$$

The operator $\hat E$ should then maximize a constrained log-likelihood functional in order to stand as the most likely quantum process giving the set of outcomes $\{\theta_{i,m}, x_{i,m}\}$ for the input probes $\{\hat\rho_m\}$. The relevant log-likelihood functional is given as

$$\mathcal L(\hat E) = \sum_{m,i}\ln\mathrm{pr}^{(m)}(\theta_{i,m},x_{i,m}) - \mathrm{Tr}[\hat\Lambda\hat E], \tag{11}$$

where $\hat\Lambda = \hat\lambda\otimes\hat I_{\mathcal K}$ and $\hat\lambda$ is the Hermitian matrix of Lagrange multipliers that incorporates the trace-preservation condition (9). Again, $\theta_{i,m}$ and $x_{i,m}$ belong to the set of quadrature data for the $m$-th probe state. An extremal equation can be obtained by varying equation (11) with respect to $\hat E$:

$$\delta\mathcal L = \sum_{m,i}\frac{\mathrm{Tr}\left[\left(\hat\rho_m^{\,T}\otimes\hat\Pi(\theta_{i,m},x_{i,m})\right)\delta\hat E\right]}{\mathrm{pr}^{(m)}(\theta_{i,m},x_{i,m})} - \mathrm{Tr}[\hat\Lambda\,\delta\hat E] = 0, \tag{12}$$

which gives

$$\mathrm{Tr}\left[\left(\hat R - \hat\Lambda\right)\delta\hat E\right] = 0. \tag{13}$$

This holds for all allowed variations $\delta\hat E$, so that the expression in the parentheses, acting on $\hat E$, can be equated to zero and one has

$$\hat E = \hat\Lambda^{-1}\hat R\,\hat E, \tag{14}$$

where

$$\hat R = \sum_{m,i}\frac{\hat\rho_m^{\,T}\otimes\hat\Pi(\theta_{i,m},x_{i,m})}{\mathrm{pr}^{(m)}(\theta_{i,m},x_{i,m})}. \tag{15}$$

Owing to Hermiticity, one may rewrite equation (14) as $\hat E = \hat E\hat R\hat\Lambda^{-1}$. Using this, along with equation (14), we arrive at

$$\hat E = \hat\Lambda^{-1}\hat R\,\hat E\,\hat R\,\hat\Lambda^{-1}. \tag{16}$$

$\hat\Lambda$ can be determined by substituting the expression for $\hat E$ in equation (16) into the trace-preservation condition (9):

$$\hat\lambda = \left(\mathrm{Tr}_{\mathcal K}\left[\hat R\,\hat E\,\hat R\right]\right)^{1/2}. \tag{17}$$

Equations (16) and (17) can be solved numerically through iterations, starting from an unbiased initial estimate such as $\hat E^{(0)} = \hat I_{\mathcal H\otimes\mathcal K}/\dim\mathcal K$. At each step of the iteration, the positive semi-definiteness of $\hat E$ is preserved and the constraint $\mathrm{Tr}_{\mathcal K}[\hat E] = \hat I_{\mathcal H}$ is satisfied.

Quantum processes may also be probabilistic, in which case the trace of the input quantum state is not preserved. The probability of occurrence of a probabilistic quantum process is given by

$$p_m = \mathrm{Tr}\left[\mathcal E(\hat\rho_m)\right]. \tag{18}$$

The reconstruction of probabilistic quantum processes can be viewed as the reconstruction of a trace-preserving, deterministic CP map $\tilde{\mathcal E}$ if the failure of the process is taken to be a measurement event associated with the projection operator $\hat\Pi_\varnothing$ onto a fictitious state $|\varnothing\rangle$ [14]. In order to analyze such a process, one can extend the Hilbert space to form $\mathcal K_{\rm total} = \mathcal K\oplus\mathcal K_{\rm fail}$, where $\mathcal K_{\rm fail}$ is spanned by the single state $|\varnothing\rangle$.
The original set of projectors $\hat\Pi_\theta(x)$ for each $\theta$ is augmented by adding $\hat\Pi_\varnothing$, so that the new set of projectors satisfies the closure relation over $\mathcal K_{\rm total}$, i.e. $\int\hat\Pi_\theta(x)\,dx + \hat\Pi_\varnothing = \hat I$ for all $\theta$. Subsequently, the likelihood functional, with the extended trace-preserving map $\tilde{\mathcal E}$ as the parameter, can be rewritten as

$$\mathcal L(\tilde{\hat E}) = \sum_m\left[\sum_i\ln\mathrm{pr}^{(m)}(\theta_{i,m},x_{i,m}) + \frac{N(1-g_m)}{g_m}\,\ln\mathrm{pr}^{(m)}_\varnothing\right] - \mathrm{Tr}[\hat\Lambda\tilde{\hat E}], \tag{19}$$

where $g_m$ is the fraction of successful events over total events, which can be determined experimentally, and $\mathrm{pr}^{(m)}_\varnothing = \mathrm{Tr}[\tilde{\hat E}(\hat\rho_m^{\,T}\otimes\hat\Pi_\varnothing)]$ is the probability of a failure event. The extremal equation would then contain a modified operator $\hat R$ given by

$$\hat R = \sum_{m,i}\frac{\hat\rho_m^{\,T}\otimes\hat\Pi(\theta_{i,m},x_{i,m})}{\mathrm{pr}^{(m)}(\theta_{i,m},x_{i,m})} + \sum_m\frac{N(1-g_m)}{g_m}\,\frac{\hat\rho_m^{\,T}\otimes\hat\Pi_\varnothing}{\mathrm{pr}^{(m)}_\varnothing}. \tag{20}$$

Iterations can now be performed with the new $\hat R$ to obtain the trace-preserving process tensor $\tilde{\hat E}$. The actual process tensor $\hat E$ is obtained by taking the projection of the estimated tensor $\tilde{\hat E}$ onto the subspace $\mathcal H\otimes\mathcal K$.

Our analysis so far did not specify which states are to be used as probes; the only requirement is that these states compose a spanning set in the space of density matrices. In csQPT, the role of probe states is played by coherent states [5]. The density operator of an arbitrary state can be written as a linear combination of coherent-state density operators using the optical equivalence theorem:

$$\hat\rho = \int P_{\hat\rho}(\alpha)\,|\alpha\rangle\langle\alpha|\,d^2\alpha, \tag{21}$$

where $P_{\hat\rho}(\alpha)$ is the Glauber-Sudarshan P function of the state $\hat\rho$. Using the linearity of quantum processes with respect to density matrices, the process output is then given by

$$\mathcal E(\hat\rho) = \int P_{\hat\rho}(\alpha)\,\mathcal E(|\alpha\rangle\langle\alpha|)\,d^2\alpha. \tag{22}$$

Therefore, if the response of the quantum system to all coherent states is known, the output for any arbitrary quantum state can be computed. In other words, measurements on the set of responses $\mathcal E(|\alpha_m\rangle\langle\alpha_m|)$ for coherent states $|\alpha_m\rangle$ provide tomographically complete information about the quantum process.
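The iteration of equations (16) and (17) amounts to a few lines of linear algebra. The sketch below is a deliberately simplified toy, not homodyne csQPT: a qubit stands in for the truncated Fock space and Pauli-basis projectors stand in for the quadrature projectors, with all names and dimensions illustrative; the point is only the update rule $\hat E \to \hat\Lambda^{-1}\hat R\hat E\hat R\hat\Lambda^{-1}$:

```python
import numpy as np

def ml_step(E, data, dh, dk):
    """One iteration of E -> Lambda^{-1} R E R Lambda^{-1}, with
    lambda = (Tr_K[R E R])^{1/2} and Lambda = lambda (x) I_K.
    This keeps E positive semidefinite and enforces Tr_K[E] = I_H.
    `data` is a list of (rho, Pi, f): probe state, measurement operator,
    observed relative frequency of that outcome."""
    R = np.zeros_like(E)
    for rho, Pi, f in data:
        if f <= 0:
            continue                      # zero-frequency outcomes drop out of R
        op = np.kron(rho.T, Pi)
        pr = np.trace(E @ op).real        # pr = Tr[E (rho^T (x) Pi)]
        R += (f / pr) * op
    RER = R @ E @ R
    trK = np.trace(RER.reshape(dh, dk, dh, dk), axis1=1, axis2=3)
    w, v = np.linalg.eigh(trK)            # Hermitian inverse square root
    lam_inv = v @ np.diag(1.0 / np.sqrt(np.clip(w, 1e-12, None))) @ v.conj().T
    L = np.kron(lam_inv, np.eye(dk))
    return L @ RER @ L

# Toy data: the identity process on a qubit, probed by four states and
# "measured" in the three Pauli bases with exact outcome frequencies.
proj = lambda v: np.outer(v, np.conj(v))
s = 2 ** -0.5
probes = [proj([1, 0]), proj([0, 1]), proj([s, s]), proj([s, 1j * s])]
bases = [[proj([1, 0]), proj([0, 1])],
         [proj([s, s]), proj([s, -s])],
         [proj([s, 1j * s]), proj([s, -1j * s])]]
e0, e1 = np.eye(2)
E_id = sum(np.kron(np.outer(a, b), np.outer(a, b))   # sum_mn |m><n| (x) |m><n|
           for a in (e0, e1) for b in (e0, e1)).astype(complex)
data = [(rho, Pi, np.trace(E_id @ np.kron(rho.T, Pi)).real)
        for rho in probes for basis in bases for Pi in basis]

E = np.eye(4, dtype=complex) / 2          # unbiased start, Tr_K[E] = I_H
for _ in range(2000):
    E = ml_step(E, data, 2, 2)
```

With exact outcome frequencies the likelihood maximum coincides with the true process, so the iterate approaches the identity's Jamiolkowski operator while the constraint $\mathrm{Tr}_{\mathcal K}[\hat E] = \hat I_{\mathcal H}$ holds at every step.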

Practical issues
We now proceed to discussing a few practical issues arising in the implementation of the above csQPT algorithm. The first issue is associated with the infinite dimension of the optical Hilbert space. In a practical implementation of csQPT, the process tensor is reconstructed for a subspace $\mathcal H(n_{\max})$ of the Hilbert space spanned by the Fock states up to a certain cut-off value $n_{\max}$. The choice of $n_{\max}$ is correlated with the maximum amplitude $\alpha_{\max}$ of the set of coherent probe states. Given a data set with a specific $\alpha_{\max}$, the choice of $n_{\max}$ depends on many factors, in particular the process itself (see the supplementary online material to [5]). For the iterative cycle, the cut-off value must be chosen sufficiently high that $\mathcal H(n_{\max})$ accommodates all of the coherent probe states and the associated output states. Otherwise, the probe states and the quadrature data will be inadequately represented in $\mathcal H(n_{\max})$, leading to an inaccurate reconstruction of the process tensor; we refer to this phenomenon as truncation errors.
For physically realistic processes, we expect the fractions of $|\alpha_m\rangle$ and $\mathcal E(|\alpha_m\rangle\langle\alpha_m|)$ that lie outside the reconstruction subspace to vanish as $n_{\max}$ tends to infinity. Hence, for a given $\alpha_{\max}$, it is possible to choose a value of $n_{\max}$ such that the associated truncation errors are arbitrarily low [17].
However, a high cut-off value may give rise to another class of inaccuracies, which we call data insufficiency errors. If the overlap of a given Fock state $|n\rangle$ with all of the $|\alpha_m\rangle$ is low, so is the contribution of $|n\rangle$ to the log-likelihood functional, and hence the available data will not provide sufficient information about the effect of the process on $|n\rangle$. In contrast to truncation errors, data insufficiency errors grow with $n_{\max}$, but they only affect the process tensor elements associated with high input photon numbers.
Therefore, the following dual-step procedure for the choice of the cut-off may be necessary. The initial value of $n_{\max}$ must be sufficiently high to ensure the absence of truncation errors. Subsequently, after the iterative cycle has been completed, we choose a secondary cut-off value $n'_{\max} \le n_{\max}$ and remove all the process tensor elements containing indices above $n'_{\max}$. The choice of $n'_{\max}$ can be determined by calculating the statistical errors associated with each process tensor element, similarly to the error estimation for state tomography [14,18,19,20]. However, further research is required to determine statistical errors for QPT and to establish a concrete bound for $n'_{\max}$. In the next section, we illustrate the effect of the chosen subspace dimension on the process reconstruction through various simulations.
As in the case of state tomography [10], our algorithm permits automatic correction for optical losses and inefficient detectors in the process tensor reconstruction. In order to account for non-unit efficiency $\eta$, the projection operators are replaced by

$$\hat\Pi^\eta(\theta,x) = \sum_{m,n,k}B_{n+k,n}(\eta)\,B_{m+k,m}(\eta)\,\langle m|\hat\Pi(\theta,x)|n\rangle\;|m+k\rangle\langle n+k|, \tag{23}$$

where $B_{n+k,n}(\eta) = \left[\binom{n+k}{k}\eta^n(1-\eta)^k\right]^{1/2}$. Substituting this into equation (15) and performing the iterations yields the original process tensor pertaining to the case of ideal detection.
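The loss-corrected measurement operator above can be assembled for any operator given by its Fock-basis matrix and truncated to a finite dimension; a minimal sketch (function names are ours):

```python
import math
import numpy as np

def bernoulli_B(npk, n, eta):
    """B_{n+k,n}(eta) = [C(n+k, k) eta^n (1 - eta)^k]^(1/2)."""
    k = npk - n
    return math.sqrt(math.comb(npk, k) * eta ** n * (1 - eta) ** k)

def loss_corrected_projector(Pi, eta, dim):
    """Generalized measurement operator for detector efficiency eta:
    Pi^eta = sum_{m,n,k} B_{n+k,n} B_{m+k,m} <m|Pi|n> |m+k><n+k|,
    truncated to a dim-dimensional Fock space."""
    Pe = np.zeros((dim, dim), dtype=complex)
    for m in range(dim):
        for n in range(dim):
            for k in range(dim - max(m, n)):
                Pe[m + k, n + k] += (bernoulli_B(n + k, n, eta)
                                     * bernoulli_B(m + k, m, eta) * Pi[m, n])
    return Pe
```

For $\hat\Pi = |1\rangle\langle 1|$ this reproduces the familiar Bernoulli smearing: a detector of efficiency $\eta$ registers one photon with probability $\eta$ for a single-photon input, and $2\eta(1-\eta)$ for a two-photon input.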
Many physically relevant processes are phase-invariant: applying an optical phase shift to the input state results in the same shift of the output. Mathematically, such processes satisfy the following relation [6,10]:

$$\mathcal E\left(e^{-i\phi\hat n}\,\hat\rho\,e^{i\phi\hat n}\right) = e^{-i\phi\hat n}\,\mathcal E(\hat\rho)\,e^{i\phi\hat n}. \tag{24}$$

In this case, further simplifications can be made. If the action of the process on a coherent state $|\alpha\rangle$ is known, so is the outcome for $|\alpha e^{i\phi}\rangle$. Therefore, one only needs to perform measurements for input coherent states with amplitudes on the positive real axis. When condition (24) is applied in the Fock basis, the elements of the process tensor $\mathcal E^{mn}_{jk}$ for a phase-invariant process vanish except when $m-n = j-k$. This condition is incorporated into the probability distribution (10) as

$$\mathrm{pr}^{(m)}(\theta,x) = \mathrm{Tr}\left[\mathcal M(\hat E)\left(\hat\rho_m^{\,T}\otimes\hat\Pi(\theta,x)\right)\right], \tag{25}$$

where $\mathcal M$ denotes a masking operation over $\hat E$. If $\hat\Pi_m = |m\rangle\langle m|$ denotes a projection operator in the number basis, then $\mathcal M(\hat E)$ can be expressed as

$$\mathcal M(\hat E) = \sum_{m,n,j,k}\delta_{m-n,\,j-k}\,(\hat\Pi_m\otimes\hat\Pi_j)\,\hat E\,(\hat\Pi_n\otimes\hat\Pi_k). \tag{26}$$

Since the trace operation is invariant under cyclic rearrangements of the operators and the Kronecker delta is invariant under transposition of indices, the probability distribution (25) and, consequently, the expression for the operator $\hat R$ in equation (15) change to

$$\mathrm{pr}^{(m)}(\theta,x) = \mathrm{Tr}\left[\hat E\,\mathcal M\!\left(\hat\rho_m^{\,T}\otimes\hat\Pi(\theta,x)\right)\right], \tag{27}$$

$$\hat R = \sum_{m,i}\frac{\mathcal M\!\left(\hat\rho_m^{\,T}\otimes\hat\Pi(\theta_{i,m},x_{i,m})\right)}{\mathrm{pr}^{(m)}(\theta_{i,m},x_{i,m})}. \tag{28}$$

When the above relations are used, the elements $\mathcal E^{mn}_{jk}$ for which $m-n\neq j-k$ vanish, resulting in the incorporation of the phase-invariance condition.
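The masking operation $\mathcal M$ simply zeroes the tensor elements violating $m-n = j-k$; a minimal sketch (the indexing convention, with the Jamiolkowski operator stored as a square matrix whose row index is $(m, j)$ and column index $(n, k)$, is ours):

```python
import numpy as np

def phase_mask(E, dh, dk):
    """M(E): zero every process tensor element E^{mn}_{jk} with
    m - n != j - k, as required for a phase-invariant process.
    E is a (dh*dk) x (dh*dk) matrix with row index (m, j) and
    column index (n, k)."""
    M = E.copy().reshape(dh, dk, dh, dk)      # indices (m, j, n, k)
    for m in range(dh):
        for j in range(dk):
            for n in range(dh):
                for k in range(dk):
                    if m - n != j - k:
                        M[m, j, n, k] = 0.0
    return M.reshape(dh * dk, dh * dk)
```

The operation is a projection, so applying it twice gives the same result as applying it once.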
In some cases, the value of the log-likelihood oscillates before converging to the maximum owing to overshoots. Stabilization can be achieved using the diluted algorithm, which slows down but guarantees convergence [21]. The operator $\hat R$, in that case, is modified to a weighted sum of itself and the identity operator:

$$\hat R_\mu = (1-\mu)\,\hat I + \mu\,\hat R, \tag{29}$$

where $0\le\mu\le 1$. As the value of $\mu$ decreases, the algorithm becomes more and more dilute, resulting in increased stability but a reduced rate of convergence. In addition, monotonic increase of the likelihood is guaranteed for small values of $\mu$ (see Appendix A). One may try to find the value of $\mu$ that maximizes the increase in likelihood at each step, at the cost of increased computational complexity. Gradually varying the value of $\mu$ during the iterations may be justified for some processes.

The number of quadrature measurements for each probe state typically runs into the tens of thousands. With multiple probe states, the iteration cycle may require significant computation time. In order to speed up the computation, binning of the data points along the quadrature and phase axes may be useful. A suitable step size is chosen for each axis as a trade-off between the desired computation time and the quantization error.
For each $\mathcal E(|\alpha_m\rangle\langle\alpha_m|)$, the quadrature data points are then grouped into bins with centers $\{\theta_{u,m}, x_{v,m}\}$. With this modification, the log-likelihood functional (2) reads

$$\mathcal L = \sum_m\sum_{u,v}h_{m;u,v}\,\ln\mathrm{pr}^{(m)}(\theta_{u,m},x_{v,m}), \tag{30}$$

where $h_{m;u,v}$ denotes the number of data points in the bin with center $(\theta_{u,m}, x_{v,m})$. Ideally, one should construct the POVM element associated with each bin from all the projectors lying within the bin; however, for a sufficiently small bin, this element can be approximated by the projector onto the quadrature value at the bin center. Similarly, the operator $\hat R$ in equation (15) can be rewritten as

$$\hat R = \sum_m\sum_{u,v}h_{m;u,v}\,\frac{\hat\rho_m^{\,T}\otimes\hat\Pi(\theta_{u,m},x_{v,m})}{\mathrm{pr}^{(m)}(\theta_{u,m},x_{v,m})}. \tag{31}$$

For further speedup, one can compute $\hat R$ in parallel on different threads, owing to the absence of interdependencies in the summation.
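The bin counts $h_{m;u,v}$ entering the binned likelihood and the binned $\hat R$ above can be produced with an ordinary two-dimensional histogram; a minimal sketch (the helper name and bin ranges are ours):

```python
import numpy as np

def bin_quadrature_data(thetas, xs, n_theta_bins, x_min, x_max, n_x_bins):
    """Histogram (theta, x) samples on a 2-D grid, returning the bin
    centres and the counts h[u, v] that weight the likelihood terms."""
    th_edges = np.linspace(0.0, np.pi, n_theta_bins + 1)
    x_edges = np.linspace(x_min, x_max, n_x_bins + 1)
    h, _, _ = np.histogram2d(thetas, xs, bins=[th_edges, x_edges])
    th_centres = 0.5 * (th_edges[:-1] + th_edges[1:])
    x_centres = 0.5 * (x_edges[:-1] + x_edges[1:])
    return th_centres, x_centres, h
```

Each bin is subsequently represented by the projector onto its central quadrature value, as described above.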
In practical experiments on probabilistic processes, the frequency of successful events can be low: $g_m \ll 1$. In this case, the process tensor elements of interest (i.e. those related to $\mathcal K$) will be small and thus suffer from increased relative error. This issue can be resolved by rescaling the values of $g_m$ by a common factor for all probe states, keeping in mind the requirement that $g_m < 1$ for all $m$. The physically relevant elements of the process tensor will then rescale by the same factor, reducing the relative error.

Implementation and results
In order to test the algorithm, we implemented it in Matlab and studied the reconstruction of a few quantum processes using simulated data. Theoretical process tensors of the identity, attenuation and photon creation processes [6] were used to find the marginal probability distribution functions for various probe states using equation (10). From these marginals, we generated synthetic experimental data through Monte Carlo simulations.
Each process was applied to four coherent probe states with $\alpha_m$ ranging from 0 to 0.9375 in steps of 0.3125. For each input probe state, the output state dataset consisted of 100,000 phase and quadrature points $\{\theta, x\}$. This set of data was subjected to the reconstruction method described above. The iterations were halted when the change in process tensor elements was insignificant over a large number of iterations. However, a better approach would be to set a threshold for the increase of the log-likelihood [22].

Figure 2: (a), (b) and (c) show process tensors for the theoretical identity, attenuation (by factor 0.9) and photon creation processes, respectively. (d), (e) and (f) show the process tensors for the corresponding reconstructed processes using the algorithm presented in this article. The photon creation process tensor has been scaled to match the theoretical one (i.e. g = 1).
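Synthetic quadrature samples of this kind can be drawn from the computed marginals by sampling the discretized distribution on a grid; a minimal sketch (the helper name is ours, and the vacuum marginal in the convention used here serves as the example):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_from_marginal(pdf, x_grid, n_samples):
    """Draw quadrature samples from a tabulated marginal distribution by
    sampling the discretized (normalized) distribution on the grid."""
    p = pdf / pdf.sum()
    idx = rng.choice(len(x_grid), size=n_samples, p=p)
    return x_grid[idx]

# Example: the vacuum marginal |<0|x>|^2 = sqrt(2/pi) exp(-2 x^2)
# (quadrature variance 1/4 in the hbar = 1/2 convention).
x = np.linspace(-6, 6, 2001)
vacuum_pdf = np.sqrt(2 / np.pi) * np.exp(-2 * x ** 2)
samples = sample_from_marginal(vacuum_pdf, x, 100_000)
```

In a simulation of a full process, one such set of samples is generated per probe state and per local oscillator phase.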
The result of the reconstruction technique is a rank-4 process tensor whose diagonal elements have a simple interpretation: for a given quantum process $\mathcal E$, the diagonal element $\mathcal E^{mm}_{kk}$ denotes the probability that the output contains $k$ photons when the process is subjected to $m$ input photons. A comparison between the diagonal elements of the theoretical and reconstructed process tensors is made in figure 2 and exhibits a close match between the two.
The process of photon creation $\hat a^\dagger$ requires additional discussion because it corresponds to a non-unitary, trace non-preserving operator. Therefore, in experimental practice it can only be implemented probabilistically. The optical mode containing the target state $|\psi\rangle$ is directed into the signal channel of a parametric down-conversion setup. The state of the down-conversion output in the signal (s) and idler (i) channels can then be written as $|\psi\rangle_s|0\rangle_i + g(\hat a^\dagger|\psi\rangle)_s|1\rangle_i$, where $g$ is the down-conversion amplitude. Detection of a photon in the idler channel projects the signal state onto $\hat a^\dagger|\psi\rangle$, thereby heralding a photon addition event [23]. For a coherent-state input $|\psi\rangle = |\alpha\rangle$, the event probability, corresponding to the quantity $g_m$ in equation (19), is proportional to $|g|^2(1+|\alpha|^2)$.
We take the value $g^2 = 0.1$ in the simulations to ensure that the success probabilities remain less than 1 for the selected probe states. This makes the photon creation process trace non-increasing and thus physical. Note that the process tensor reported in figure 2(f) has been normalized by dividing it by $g^2$, so that its scale matches that of the process tensor for the photon creation operator $\hat a^\dagger$ given in figure 2(c).
The iterative reconstruction of photon creation exhibited relatively poor convergence. Diluted iterations (29) were required in the beginning in order to curb oscillations. However, as the iterations progressed, the rate of increase of the likelihood became extremely low. We circumvented this issue by implementing a successive over-relaxation technique: setting $\mu$ in equation (29) to slightly over 1 while iterating accelerated the increase in likelihood. As soon as a decrease in the likelihood value was registered due to an overshoot, $\mu$ was reset to 1 and then slowly increased again after stabilization. This procedure was applied multiple times until satisfactory convergence was observed. In a loose sense, the over-relaxation method employs linear extrapolation: it selects a tensor that lies on the line joining the current iterate and the next iterate, but beyond the latter by a fraction of their separation. If the iterations happen to proceed in the direction of the maximum-likelihood gradient, this allows faster convergence by inducing greater leaps. Additionally, it may also help in escaping limit cycles encountered during the iterations.
We have also tested the reconstruction technique for photon creation in the case of inefficient detection. The output density matrices for the probe states were calculated using the beam splitter model of absorption [16]. With these modified density matrices, we have generated test data using Monte Carlo simulations for η = 0.75 and η = 0.55. A comparison of the reconstructed process tensors is given in figure 3.
Finally, we investigated the effect of the dimension of the subspace of the optical Hilbert space chosen for the reconstruction, specifically for the photon creation process. In order to eliminate statistical errors in this reconstruction, we directly used the marginal distributions instead of simulated quadrature data sets to obtain the values of $h_{m;u,v}$ in equations (30) and (31).

Figure 4: Effect of the photon number cut-off ($n_{\max}$) on the reconstruction of the photon creation process. (a) Worst-case process reconstruction fidelity as a function of $n_{\max}$ for $\alpha_{\max} = 0.4$ (top) and $\alpha_{\max} = 0.6$ (bottom) and different $n'_{\max}$. The slight decreases of fidelity with increasing $n_{\max}$ and constant $n'_{\max}$, observed in some cases, are numerical artefacts. (b) Diagonal values of the process tensor $\mathcal E^{mm}_{kk}$ for $\alpha_{\max} = 0.6$ and $n_{\max} = 3$. The reconstructed process tensor has significant artefacts due to truncation errors. (c) Diagonal values of the process tensor $\mathcal E^{mm}_{kk}$ for $\alpha_{\max} = 0.6$ and $n_{\max} = 8$. The reconstructed process tensor elements associated with input photon numbers 6, 7 and 8 are invalid due to data insufficiency (for example, $|\langle\alpha_{\max} = 0.6|n = 6\rangle|^2 = 2.1\times 10^{-6}$).

The performance criterion is taken to be the worst-case fidelity
[24] over the input space $\mathcal H(n_{\max}-1)$, defined as

$$F = \min_{\hat\rho\,\in\,\mathcal H(n_{\max}-1)}F\!\left(\mathcal E(\hat\rho),\,\mathcal E_{\rm est}(\hat\rho)\right), \tag{32}$$

where $\mathcal E$ is the actual process tensor and $\mathcal E_{\rm est}$ is the estimated process tensor. Note that the photon number cut-off for the fidelity calculation is taken to be $n_{\max}-1$ to ensure the closure of the Hilbert space under photon addition. Minimization over $\mathcal H(n_{\max}-1)$ is carried out through a Monte Carlo simulation that involves introducing small random changes into the density matrix $\hat\rho$ within $\mathcal H(n_{\max}-1)$ and accepting the change whenever the value of the fidelity decreases. The solid line in figure 4(a) shows the worst-case fidelity versus $n_{\max}$ for two values of $\alpha_{\max}$. For each given $\alpha_{\max}$, the fidelity initially increases with $n_{\max}$ as the truncation effects subside and decreases afterwards due to data insufficiency. The range of $n_{\max}$ over which the process tensor is reconstructed correctly shifts towards higher photon numbers with increasing $\alpha_{\max}$, owing to the greater contribution of higher photon numbers in probe states of higher amplitudes. Figures 4(b) and (c) further illustrate the two types of errors associated with the choice of the cut-off point. If $n_{\max}$ is chosen too low (figure 4(b)), truncation errors compromise the entire reconstructed process. If the reconstruction subspace is sufficient to accommodate all the input probe states and associated output states (figure 4(c)), only the process tensor elements associated with high input photon numbers are reconstructed incorrectly. In this case, introducing a secondary cut-off at $n'_{\max} = 5$ is justified.
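The worst-case-fidelity minimization can be sketched as a simple random-descent search. The code below is a simplified version of the procedure described above: it restricts the search to pure input states and uses helper names of our own choosing, whereas the actual search perturbs general density matrices:

```python
import numpy as np

rng = np.random.default_rng(1)

def apply_process(E, rho, dh, dk):
    """rho_out = Tr_H[ E (rho^T (x) I_K) ]."""
    A = (E @ np.kron(rho.T, np.eye(dk))).reshape(dh, dk, dh, dk)
    return np.trace(A, axis1=0, axis2=2)

def fidelity(r1, r2):
    """Uhlmann fidelity F = (Tr sqrt(sqrt(r1) r2 sqrt(r1)))^2."""
    w, v = np.linalg.eigh(r1)
    s1 = v @ np.diag(np.sqrt(w.clip(0))) @ v.conj().T
    wm = np.linalg.eigvalsh(s1 @ r2 @ s1)
    return np.sum(np.sqrt(wm.clip(0))) ** 2

def worst_case_fidelity(E_true, E_est, dh, dk, n_steps=3000, step=0.2):
    """Random-descent minimisation of F(E(rho), E_est(rho)): perturb the
    input state and keep the change whenever the fidelity decreases."""
    psi = rng.normal(size=dh) + 1j * rng.normal(size=dh)
    psi /= np.linalg.norm(psi)

    def f(p):
        rho = np.outer(p, p.conj())
        return fidelity(apply_process(E_true, rho, dh, dk),
                        apply_process(E_est, rho, dh, dk))

    best = f(psi)
    for _ in range(n_steps):
        trial = psi + step * (rng.normal(size=dh) + 1j * rng.normal(size=dh))
        trial /= np.linalg.norm(trial)
        ft = f(trial)
        if ft < best:
            best, psi = ft, trial
    return best
```

For two identical processes the routine returns a fidelity of 1; for two processes differing by a unitary it quickly finds low-fidelity inputs.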
The dashed lines in figure 4(a) display the advantages of the dual cut-off approach introduced in the previous section. For example, with $\alpha_{\max} = 0.6$, optimal reconstruction is attained with the initial cut-off point $n_{\max}\ge 6$ and subsequent cropping of the process tensor with $n'_{\max} = 5$. With this approach, the worst-case reconstruction fidelity is higher than for all cases with $n'_{\max} = n_{\max}$. Note, however, that in most examples we studied, the dual cut-off method offers only a small advantage and may not be justified in practical csQPT.
As evidenced by figure 4(a), the secondary cut-off points should be chosen close to $n'_{\max} = 3$ and 5 for $\alpha_{\max} = 0.4$ and 0.6, respectively. These probe amplitudes are much lower than those prescribed by the supplementary information to Ref. [5]: for the same values of $n_{\max} = 3, 5$, the method of Ref. [5] would require maximum coherent state amplitudes of 8 and 12, respectively. Further, the method of Ref. [6] requires $2N$ probe states for reconstruction in a Fock space of dimension $d = N+1$, while our method poses no such constraint. In other words, for a given set of probe coherent states (defined by their number and maximum amplitude $\alpha_{\max}$), the present reconstruction method provides much more information about the process tensor than its predecessors.
One must note that further research is needed for the inverse problem of determining the optimal $\alpha_{\max}$ for a chosen Fock space cut-off $n_{\max}$. According to the numerical examples we studied, it is reasonable to choose $\alpha_{\max}$ such that $|\langle\alpha_{\max}|n_{\max}\rangle|^2 \approx 1/N$, where $N$ is the total number of quadrature measurements.

Summary
We have presented a maximum-likelihood based data processing technique for the tomographic reconstruction of quantum optical processes. The technique relies on measuring the response of the process to various coherent probe states through optical homodyne tomography. The reconstruction applies directly to the acquired data, unlike the previous coherent-state QPT methods that involve intermediate reconstruction of the density matrices of the output states. The range of probe states required for reconstruction has also been reduced. Complete positivity and trace preservation/non-increase conditions are incorporated into the estimated process tensor by imposing a priori constraints, thus yielding physical results. The simplicity and robustness of this technique make it appealing for quantum process estimation, with applications extending to optical quantum computing and quantum communication.