Quantum proportional-integral (PI) control

Hui Chen; Hanhan Li; Felix Motzoi; Leigh Martin; K Birgitta Whaley; Mohan Sarovar

doi:10.1088/1367-2630/abc464

1. Introduction

The maturation of quantum technologies relies heavily on the development of advanced quantum measurement and control solutions. For this purpose, many concepts and solutions developed in classical control theory and practice can be carried over to the quantum domain. Recent examples of useful application of classical control concepts in the context of quantum systems are Lyapunov control [1–5], LQG control [6–10], risk-sensitive control [11–13], and filtering and smoothing for estimation and control [14–25].

Feedback control is particularly important for applications such as error correction, cooling, and stabilization of quantum systems. Feedback becomes most interesting when the control signals can be applied to a quantum system at timescales that are comparable to the timescale of the measurement. In this case, one must model the effects of intrinsic time evolution, measurement (including quantum backaction) and feedback control all at the same time, which results in interesting and complex dynamics. This typically leads to a description in terms of a continuous-in-time stochastic dynamical equation for the density matrix of the quantum system. The simplest type of feedback, in which the feedback operation is directly proportional to the measurement signal at the same time, leads to Markovian evolution of the system [26, 27]. This proportional feedback (often termed 'direct' feedback) has been applied in theoretical analysis of many problems including state stabilization and cooling [28–32], quantum error correction [33–35], state purification [36, 37] and generation of entangled states [38–41] and squeezed states [42], and has also been experimentally demonstrated [43–46]. Recent work has extended quantum feedback control beyond proportional feedback to implementations based on estimation of the quantum state [47–49], implementations using stochastic noise sources [50], and to implementations using the most general form of feedback that does not include a time-delayed proportional term [51]. In the latter framework, referred to as proportional and quantum state estimation (PaQS) feedback, the feedback operator can equivalently be expressed as a sum of independent deterministic and stochastic contributions. This approach has also been extended to multiple measurement and feedback operators [41]. In several instances, locally optimal feedback laws have been derived [6, 7, 39, 51–54], with global optimality being shown in a smaller number of cases [39, 53, 54].

As is the case for complex classical systems, the implementation of advanced, and particularly of optimal, feedback control solutions can be challenging, due to instrumentation and computation demands. Therefore, it is important to also develop heuristic control solutions in the quantum domain. In this paper we adapt one of the most widely used classical control heuristics, proportional-integral, referred to as PI feedback control [55], to the quantum domain. In the classical domain both P and PI feedback are subsets of proportional, integral, derivative (PID) control, which includes options for modulating the feedback signal with both integrals and derivatives of the measurement signal, in addition to simple multiples of this. In classical PID control, the feedback signal is proportional to the function

$\begin{equation}f\left(t\right)={\alpha }_{\mathrm{p}}e\left(t\right)+{\alpha }_{\mathrm{i}}{\int }_{0}^{\tau }\mathrm{d}{t}^{\prime }\enspace \chi \left({t}^{\prime }\right)e\left(t-{t}^{\prime }\right)+{\alpha }_{\mathrm{d}}\enspace \frac{\mathrm{d}}{\mathrm{d}t}\enspace e\left(t\right),\end{equation} \tag{ 1 }$

where e(t) is an error signal that is usually derived from the measurement at time t, and α_p, α_i, α_d are real coefficients that dictate the relative weights of the proportional, integral and derivative information, respectively, in forming the control law at any time. These weights are usually tuned empirically to achieve good control performance, since their optimal values cannot be computed a priori except for very simple systems. Intuitively, the integral portion is used to compensate for unused parts of the measurement signal at earlier times: integration can increase the signal-to-noise ratio, can decrease the amount of time it takes to reach the steady state, and can decrease overshoot of the desired set point. The third component of equation (1), derivative control, can increase the stability of a result by suppressing rapid deviations away from the desired target; here the derivative attempts to anticipate the direction of change in the error. While PID control is not known to be optimal in any general setting, it has proven to be a very useful framework for formulating heuristic control laws in practice [55].
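To make the roles of the three terms in equation (1) concrete, the following minimal sketch evaluates f(t) on a uniformly sampled error signal. The function name `pid_signal`, the rectangle-rule quadrature and the backward-difference derivative are illustrative assumptions and are not taken from reference [55].

```python
import numpy as np

def pid_signal(e, dt, alpha_p, alpha_i, alpha_d, chi):
    """Discretized equation (1), evaluated at the most recent sample.

    e   : error samples e(t_0), ..., e(t_n), uniformly spaced by dt (assumed)
    chi : kernel samples chi(0), chi(dt), ... covering the window [0, tau]
    """
    e = np.asarray(e, dtype=float)
    chi = np.asarray(chi, dtype=float)
    m = min(len(chi), len(e))
    # e(t - t') for t' = 0, dt, ..., (m - 1) dt: most recent sample first
    recent_first = e[::-1][:m]
    integral = np.sum(chi[:m] * recent_first) * dt            # rectangle rule
    derivative = (e[-1] - e[-2]) / dt if len(e) > 1 else 0.0  # backward difference
    return alpha_p * e[-1] + alpha_i * integral + alpha_d * derivative
```

Setting `alpha_d = 0` gives the PI law considered in the rest of the paper, and setting `alpha_i = 0` as well recovers pure proportional feedback.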

In this paper we address the extension of the first two components of PID control to the quantum domain, formulating a quantum PI feedback law and analyzing the relative benefits of quantum PI, I, and P feedback in two canonical problems for quantum control, namely generation of entanglement between remote qubits using local Hamiltonians and non-local measurements, and state stabilization of the harmonic oscillator in an external environment. In contrast to some earlier studies of these systems [6, 39, 53], our feedback implementations for these problems do not require any state estimation and only rely on simple integrals of the measured signal. We allow for a time delay in the implementation of P feedback, as originally proposed by Wiseman [27]. A time delay between obtaining the measurement signal and implementing a feedback operation reflects common experimental constraints and is often regarded as being detrimental to proportional feedback [56, 57]. However we shall see that in the case of state stabilization of the harmonic oscillator, a time delay introduces additional flexibility of feedback that can be beneficial when the feedback control operations are restricted. We also examine the robustness of P feedback with respect to uncertainties in the time delay, in particular, to increases in the time delay beyond the ideal values for each protocol.

In general, our findings for these two classes of implementations show that adding an integral component to quantum feedback control can be useful in some but not all settings. This is different from the classical setting where adding an integral component to feedback control is almost universally beneficial [55]. The different behavior of quantum systems can be rationalized by recalling a key difference between quantum and classical settings, which is the unavoidable presence of stochastic measurement noise in quantum systems. In classical systems measurement noise can be minimized and even sometimes eliminated. However for quantum systems, any information gain from a measurement necessarily comes at the cost of added noise on the system. The proportional component of feedback can be very effective at minimizing the impact of this added noise. In special cases, including entanglement generation for two qubits with unit efficiency measurements [53] and the harmonic oscillator state stabilization with both position and momentum controls [6], P feedback can be used to cancel the measurement noise. However, when this is not possible or when there are additional noise sources, we find that I feedback, or a combination of I and P feedback, can be more effective than P feedback.

We note that rigorous analysis of a quantum version of full PID control within the input–output analysis of controlled quantum stochastic evolutions has been recently presented by Gough [58, 59]. In the current study of practical implementations, we do not investigate the full PID control in the quantum setting because the singular nature of the quantum measurement record makes the derivative terms ill-behaved and thus not useful for practical control implementations without further modifications. See references [31, 32] for interesting applications of derivative-based feedback.

The remainder of the paper is organized as follows. Section 2 introduces notation and presents the general equation for PI feedback. Section 3 discusses the control of entanglement of two remote qubits via a half-parity measurement and local feedback operations. Here we find that I feedback and PI feedback both show improved performance over P feedback alone. Section 4 investigates the control of state stabilization of a harmonic oscillator in a thermal environment, using feedback control on either both oscillator quadratures or a single quadrature. When control over both quadratures is possible, P feedback is found to perform better than pure I feedback control. When control over only the position quadrature is available, we find that time delay in P feedback can be beneficial, by allowing an approximation of the average momentum of the state that can be used to generate a good control law. However, despite this improvement of delayed P feedback over the direct, i.e., instantaneous, setting, a pure I feedback control strategy is nevertheless found to give better performance under the conditions of thermal damping. In both sections 3 and 4 we compare the results with prior work employing state estimation based feedback, and also analyze the robustness of the control law with respect to non-ideal time delay values. We close with a discussion and outlook for further work in section 5.

2. Formalism

In this section, we will develop the formalism for a quantum system under continuous-in-time measurement (e.g., homodyne detection) and PI feedback control. Figure 1 shows a block diagram of the feedback system that we aim to model. We define ρ to be the state of the system, H the intrinsic Hamiltonian, c the variable-strength measurement operator, and η the measurement efficiency. We will set ℏ = 1 throughout the paper.

**Figure 1.** Schematic block diagram of PI feedback control of a quantum system. First, the result of a continuous measurement on a quantum system is compared with a target value to form an error signal. This error signal is used to form two signals: (i) a scaled version obtained by multiplication by a real coefficient α_p ⩾ 0, and (ii) a smoothed version obtained by integrating over a time interval and multiplication by a real coefficient α_i ⩾ 0. These two signals are then additively combined and then used to condition actuation of the quantum system by an operator F.

The dynamics of the system conditioned on the measurement record, but without feedback control, is described by the following Itô stochastic master equation (SME) [60]:

$\begin{equation}{\left[\mathrm{d}\rho \left(t\right)\right]}_{m}=-\mathrm{i}\left[H,\rho \left(t\right)\right]\mathrm{d}t+\mathcal{D}\left[c\right]\rho \left(t\right)\mathrm{d}t+\sqrt{\eta }\mathcal{H}\left[c\right]\rho \left(t\right)\mathrm{d}W\left(t\right),\end{equation} \tag{ 2 }$

where dW(t) are Wiener increments (Gaussian-distributed random variables with mean zero and autocorrelation $\mathbb{E}\left\{\mathrm{d}W\left(s\right)\mathrm{d}W\left(t\right)\right\}=\delta \left(t-s\right)\mathrm{d}t$ ). The superoperators $\mathcal{D}$ and $\mathcal{H}$ in this equation are defined as $\mathcal{D}\left[A\right]\rho \equiv A\rho {A}^{{\dagger}}-\frac{1}{2}\left({A}^{{\dagger}}A\rho +\rho {A}^{{\dagger}}A\right)$ and $\mathcal{H}\left[A\right]\rho \equiv A\rho +\rho {A}^{{\dagger}}-\mathrm{Tr}\left[\left(A+{A}^{{\dagger}}\right)\rho \right]\rho$ . The corresponding measurement current can be written as [60]

$\begin{equation}j\left(t\right)=\langle c+{c}^{{\dagger}}\rangle \left(t\right)+\xi \left(t\right)/\sqrt{\eta },\end{equation} \tag{ 3 }$

where ξ(t) ≡ dW/dt is a white noise process. To emphasize the link between the measurement current and the conditional state evolution, the last term in equation (2) is sometimes written as $\eta \left(j\left(t\right)-\langle c+{c}^{{\dagger}}\rangle \left(t\right)\right)\mathcal{H}\left[c\right]\rho \left(t\right)\mathrm{d}t$ .
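As an illustration of how the superoperators in equation (2) and the current in equation (3) translate into a numerical update, the sketch below implements one Euler–Maruyama step of the measurement-only SME with NumPy. The function names and the single-qubit example (c = σ_z, H = 0, η = 0.4) are assumptions chosen for illustration and are not part of the original formalism.

```python
import numpy as np

def dissipator(A, rho):
    """D[A]rho = A rho A^dag - (A^dag A rho + rho A^dag A) / 2."""
    Ad = A.conj().T
    return A @ rho @ Ad - 0.5 * (Ad @ A @ rho + rho @ Ad @ A)

def innovation(A, rho):
    """H[A]rho = A rho + rho A^dag - Tr[(A + A^dag) rho] rho."""
    Ad = A.conj().T
    return A @ rho + rho @ Ad - np.trace((A + Ad) @ rho) * rho

def measurement_step(rho, H, c, eta, dt, dW):
    """One Euler-Maruyama step of the SME without feedback.

    Returns the updated state and the current increment j(t) dt.
    """
    mean_signal = np.real(np.trace((c + c.conj().T) @ rho))
    drho = (-1j * (H @ rho - rho @ H) + dissipator(c, rho)) * dt
    drho = drho + np.sqrt(eta) * innovation(c, rho) * dW
    dy = mean_signal * dt + dW / np.sqrt(eta)
    return rho + drho, dy

# Illustrative example: a single qubit prepared in |+>, measured along sigma_z
sz = np.array([[1, 0], [0, -1]], dtype=complex)
rho_plus = 0.5 * np.ones((2, 2), dtype=complex)
rng = np.random.default_rng(0)
dt = 1e-3
dW = rng.normal(0.0, np.sqrt(dt))   # Wiener increment with variance dt
rho_next, dy = measurement_step(rho_plus, 0 * sz, sz, 0.4, dt, dW)
```

Both superoperators are traceless when Tr ρ = 1, so a single step preserves the trace up to round-off.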

Before adding the feedback, we first define the error signal by analogy with classical PID control, as

$\begin{equation}e\left(t\right)=j\left(t\right)-g\left(t\right),\end{equation} \tag{ 4 }$

where g(t) is the setpoint or goal. This is often the desired value of the observable ⟨c + c^†⟩(t) but could also be another target function. g(t) is assumed to be a smoothly varying or constant function. Then the PI feedback operator in the quantum setting takes the form

$\begin{equation}\left[{\alpha }_{\mathrm{p}}\left(t\right)e\left(t-{\tau }_{\mathrm{P}}\right)+{\alpha }_{\mathrm{i}}\left(t\right)\mathcal{J}\left(t\right)\right]F,\end{equation} \tag{ 5 }$

with some Hermitian operator F. Here α_p(t) and α_i(t) are time-dependent proportional and integral coefficients, respectively. This differs from classical PID control where the control coefficients are time-independent. Here, we will allow for time-dependence that is deterministic and independent of the measurement current, although in the following we will drop the time index on these coefficients for conciseness unless we wish to emphasize the time-dependence. We have also included the freedom of having a time delay τ_P > 0 in the proportional component. While this is often viewed as an experimental constraint on implementation of quantum feedback control protocols that is detrimental to performance [56, 57], we shall see below that for the harmonic oscillator state stabilization problem this can be used constructively to improve performance (subsection 4.2). $\mathcal{J}\left(t\right)$ is the integrated error signal,

$\begin{equation}\mathcal{J}\left(t\right)={\int }_{t-{\tau }_{\mathrm{I}}}^{t}\mathrm{d}s\enspace w\left(t,s\right)\left[j\left(s\right)-g\left(s\right)\right],\end{equation} \tag{ 6 }$

where w is a smooth integration kernel that can be used to vary the contribution of the measurement current at past times, and τ_I is the integration time. We shall assume the kernels are L² integrable and normalize them such that ${\int }_{t-{\tau }_{\mathrm{I}}}^{t}\mathrm{d}s\enspace w\left(t,s\right)=1$ . Time-homogeneous kernels depend only on the time separation, w(t, s) → w(t − s). Typically, w(t, s) decays with t − s and puts decreasing weight on measurement results from further in the past.

The action of this PI feedback only is captured by the following dynamics of the system density matrix ρ(t):

$\begin{equation}{\left[\dot {\rho }\left(t\right)\right]}_{\mathrm{f}\mathrm{b}}=\mathcal{K}\rho \equiv -\mathrm{i}\left[{\alpha }_{\mathrm{p}}e\left(t-{\tau }_{\mathrm{P}}\right)+{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\right]\left[F,\rho \left(t\right)\right].\end{equation} \tag{ 7 }$

We now combine equations (2) and (7) to derive the SME for evolution under measurements and the PI feedback, using the general formalism developed in reference [27] and its extension to smoothed feedback signals in references [27, 61]. For convenience we define the commutator superoperator F^× as F^×ρ ≡ [F, ρ]. The time-evolved state after an infinitesimal time dt is given by

$\begin{equation}\rho \left(t+\mathrm{d}t\right)={\mathrm{e}}^{\mathcal{K}\mathrm{d}t}\left\{1-\mathrm{i}{H}^{{\times}}\enspace \mathrm{d}t+\mathcal{D}\left[c\right]\mathrm{d}t+\sqrt{\eta }\mathcal{H}\left[c\right]\mathrm{d}W\left(t\right)\right\}\rho \left(t\right).\end{equation} \tag{ 8 }$

Note that this form ensures causality, since the feedback acts after the evolution due to measurement. The infinitesimal evolution equation is then obtained by expanding the exponential ${\mathrm{e}}^{\mathcal{K}\mathrm{d}t}$ in a Taylor series up to order dt. The first and second order terms in this expansion are:

$\begin{align}\hfill \mathcal{K}\mathrm{d}t& =-\mathrm{i}\left[{\alpha }_{\mathrm{p}}e\left(t-{\tau }_{\mathrm{P}}\right)+{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\right]\mathrm{d}t{F}^{{\times}}\hfill \\ \hfill & =-\mathrm{i}{\alpha }_{\mathrm{p}}\left[\langle c+{c}^{{\dagger}}\rangle \left(t-{\tau }_{\mathrm{P}}\right)\mathrm{d}t+\mathrm{d}W\left(t-{\tau }_{\mathrm{P}}\right)/\sqrt{\eta }-g\left(t-{\tau }_{\mathrm{P}}\right)\mathrm{d}t\right]{F}^{{\times}}-\mathrm{i}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\mathrm{d}t{F}^{{\times}}\hfill \end{align} \tag{ 9 }$

$\begin{align}\hfill {\mathcal{K}}^{2}\enspace \mathrm{d}{t}^{2}& =-{\left[{\alpha }_{\mathrm{p}}e\left(t-{\tau }_{\mathrm{P}}\right)+{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\right]}^{2}\mathrm{d}{t}^{2}\enspace {F}^{{\times}}{F}^{{\times}}\hfill \\ \hfill & =-\frac{{\alpha }_{\mathrm{p}}^{2}}{\eta }\enspace \mathrm{d}t\enspace {F}^{{\times}}{F}^{{\times}}+O\left(\mathrm{d}W\enspace \mathrm{d}t\right),\hfill \end{align} \tag{ 10 }$

where to write the second line in each equality we have expanded e(t) = j(t) − g(t), used the definitions of j(t) (equation (3)) and $\mathcal{J}\left(t\right)$ (equation (6)), and applied the Itô rule dW(s)dW(t) = δ(t − s)dt.
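The way in which the second-order term of equation (10) gives rise to a dissipator can be made explicit. For Hermitian F, the double commutator satisfies

$\begin{equation}{F}^{{\times}}{F}^{{\times}}\rho =\left[F,\left[F,\rho \right]\right]={F}^{2}\rho -2F\rho F+\rho {F}^{2}=-2\mathcal{D}\left[F\right]\rho ,\end{equation}$

so that the second-order term of the Taylor expansion of ${\mathrm{e}}^{\mathcal{K}\mathrm{d}t}$ contributes

$\begin{equation}\frac{1}{2}{\mathcal{K}}^{2}\enspace \mathrm{d}{t}^{2}\rho =\frac{{\alpha }_{\mathrm{p}}^{2}}{\eta }\mathcal{D}\left[F\right]\rho \enspace \mathrm{d}t+O\left(\mathrm{d}W\enspace \mathrm{d}t\right).\end{equation}$

Only the proportional coefficient appears here because the integrated signal $\mathcal{J}\left(t\right)\mathrm{d}t$ is of order dt and its square is therefore negligible.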

Therefore, discarding all terms less than order dt, the evolution for the system conditioned on the measurement and subsequently acted upon by the PI feedback control is

$\begin{align}\hfill \rho \left(t+\mathrm{d}t\right)=& \left\{1-\mathrm{i}{\alpha }_{\mathrm{p}}\left[\langle c+{c}^{{\dagger}}\rangle \left(t-{\tau }_{\mathrm{P}}\right)\mathrm{d}t+\frac{\mathrm{d}W\left(t-{\tau }_{\mathrm{P}}\right)}{\sqrt{\eta }}-g\left(t-{\tau }_{\mathrm{P}}\right)\mathrm{d}t\right]{F}^{{\times}}-\mathrm{i}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right){F}^{{\times}}\enspace \mathrm{d}t+\frac{{\alpha }_{\mathrm{p}}^{2}}{\eta }\mathcal{D}\left[F\right]\mathrm{d}t\right\}\hfill \\ \hfill & {\times}\left\{1-\mathrm{i}{H}^{{\times}}\enspace \mathrm{d}t+\mathcal{D}\left[c\right]\mathrm{d}t+\sqrt{\eta }\mathcal{H}\left[c\right]\mathrm{d}W\left(t\right)\right\}\rho \left(t\right).\hfill \end{align} \tag{ 11 }$

Multiplying this expression out and again discarding all terms smaller than O(dt), we find that the evolution for feedback with a delay in the P component, τ_P > 0, is given by

$\begin{align}\hfill \mathrm{d}\rho \left(t\right)=& \left\{-\mathrm{i}\left[H,\rho \left(t\right)\right]+\mathcal{D}\left[c\right]\rho \left(t\right)+\frac{{\alpha }_{\mathrm{p}}^{2}}{\eta }\mathcal{D}\left[F\right]\rho \left(t\right)-\mathrm{i}\left({\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)+{\alpha }_{\mathrm{p}}e\left(t-{\tau }_{\mathrm{P}}\right)\right)\left[F,\rho \right]\right\}\mathrm{d}t\hfill \\ \hfill & +\sqrt{\eta }\mathcal{H}\left[c\right]\rho \left(t\right)\mathrm{d}W\left(t\right).\hfill \end{align} \tag{ 12 }$

For the zero time delay case, we go back to equation (11), set τ_P = 0 and again multiply the expression out and discard terms smaller than O(dt) to get [26, 62]

$\begin{align}\hfill \mathrm{d}\rho \left(t\right)=& \left\{-\mathrm{i}\left[H,\rho \left(t\right)\right]+\mathcal{D}\left[c\right]\rho \left(t\right)+\frac{{\alpha }_{\mathrm{p}}^{2}}{\eta }\mathcal{D}\left[F\right]\rho \left(t\right)-\mathrm{i}\left({\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)-{\alpha }_{\mathrm{p}}g\left(t\right)\right)\left[F,\rho \right]-\mathrm{i}{\alpha }_{\mathrm{p}}\left[F,c\rho \left(t\right)+\rho \left(t\right){c}^{{\dagger}}\right]\right\}\mathrm{d}t\hfill \\ \hfill & +\mathcal{H}\left[\sqrt{\eta }c-\frac{\mathrm{i}{\alpha }_{\mathrm{p}}}{\sqrt{\eta }}F\right]\rho \left(t\right)\mathrm{d}W\left(t\right).\hfill \end{align} \tag{ 13 }$

Note that in general it is not possible to obtain equation (13) by setting τ_P = 0 in equation (12). With zero time delay, the correlation between the feedback noise and the measurement noise creates an order dt term (proportional to [F, cρ(t) + ρ(t)c^†]) that is absent when there is a finite time delay.
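The origin of this extra term can be traced in one line. For τ_P = 0 the two noise terms in equation (11) multiply and, using dW(t)dW(t) = dt, give

$\begin{equation}\left(-\frac{\mathrm{i}{\alpha }_{\mathrm{p}}}{\sqrt{\eta }}{F}^{{\times}}\mathrm{d}W\left(t\right)\right)\left(\sqrt{\eta }\mathcal{H}\left[c\right]\rho \enspace \mathrm{d}W\left(t\right)\right)=-\mathrm{i}{\alpha }_{\mathrm{p}}\left[F,c\rho +\rho {c}^{{\dagger}}\right]\mathrm{d}t+\mathrm{i}{\alpha }_{\mathrm{p}}\langle c+{c}^{{\dagger}}\rangle \left[F,\rho \right]\mathrm{d}t.\end{equation}$

The second term on the right-hand side cancels the first-order contribution $-\mathrm{i}{\alpha }_{\mathrm{p}}\langle c+{c}^{{\dagger}}\rangle \left[F,\rho \right]\mathrm{d}t$ , which is why the expectation value ⟨c + c^†⟩ does not appear explicitly in equation (13). For τ_P > 0 the increments dW(t − τ_P) and dW(t) are independent, their product is of higher order than dt, and no such term arises.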

The two SMEs in equations (12) and (13) represent the evolution of the quantum system conditioned on a continuous measurement record, together with PI feedback based on that record. Examining the terms proportional to α_i, it is evident that the integral feedback component just adds a generator of time-dependent unitary evolution to the system dynamics. This is in contrast to proportional feedback, which in addition to adding coherent evolution terms, also adds a dissipative evolution term and for τ_P = 0 also modifies the stochastic evolution term (the term proportional to dW in equation (13)). This reflects the difference that in proportional feedback, the delta-correlated noise is directly fed back at each time instant, whereas in integral feedback, the feedback action is conditioned on a smoothed, tempered signal and thus is able to generate a conventional (time-dependent) Hamiltonian term. Note that while the latter is not necessarily smoothly varying in time, its increments are O(dt). We emphasize that these SMEs model feedback that requires no state estimation (usually a computationally expensive task), and thus are more suitable for application to experimental implementations. However, P feedback with τ_P = 0 will always be an approximation since any measurement and feedback loop will have finite delay. The τ_P = 0 limit is a good approximation if the delay is small compared to the intrinsic system evolution time scales.

In this work, we simulate the above stochastic differential equations (SDEs) describing evolution under PI feedback with a generalized Euler–Maruyama method. In the usual Euler–Maruyama method [63], one generates a Wiener noise increment dW(t) for each time step [t, t + dt] and then updates the state according to the stochastic differential equation. In our generalized Euler–Maruyama method, for each time t we keep a record of the noise up to time τ = max(τ_I, τ_P) in the past, i.e., (dW(t), dW(t − dt), ..., dW(t − τ)). Then dW(t − τ_P) is accessible and $\mathcal{J}\left(t\right)$ can be calculated at each time t. The state is then updated according to the SME equation (11) as usual. We normalize the density matrix at each time step to compensate for numerical round-off errors.
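The bookkeeping of past noise described above can be sketched as follows. This is a minimal illustration and not the authors' code: the class name `CurrentHistory`, the choice to store current increments dy = j(t)dt rather than the bare dW, and the zero-filled record before the start of the simulation are all assumptions.

```python
from collections import deque
import numpy as np

class CurrentHistory:
    """Fixed-length record of current increments dy(t) = j(t) dt on a uniform grid."""

    def __init__(self, tau_max, dt):
        self.dt = dt
        # Number of samples covering the window [t - tau_max, t]
        self.n = int(round(tau_max / dt)) + 1
        # Assumption: the record before t = 0 is taken to be zero
        self.buf = deque([0.0] * self.n, maxlen=self.n)

    def push(self, dy):
        """Append the newest increment; the oldest one is dropped automatically."""
        self.buf.append(dy)

    def delayed(self, tau):
        """Return dy(t - tau), with tau rounded to the nearest grid point."""
        k = int(round(tau / self.dt))
        return self.buf[-1 - k]

    def integral(self, weights):
        """Return sum_k weights[k] * dy(t - k dt), a discretized J(t) for g = 0."""
        recent_first = np.array(self.buf)[::-1][:len(weights)]
        return float(np.dot(weights[:len(recent_first)], recent_first))
```

In a simulation loop one would call `push` once per time step with tau_max = max(τ_I, τ_P), then read `delayed(tau_P)` and `integral(weights)` to build the feedback terms before updating the state.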

3. Two-qubit entanglement generation

In this section, we compare the performance of P feedback, I feedback, and PI feedback for the task of generating an entangled two-qubit state with a local Hamiltonian and non-local measurement. This non-trivial state generation task was first addressed by measurement-based control with post-selection [64–66], then by P feedback and discrete feedback [39, 53] and most recently by PaQS control [51]. For perfect measurement efficiency η = 1, the proportional feedback strategy with time-dependent α_p(t) was shown in reference [53] to be globally optimal among all protocols that have constant measurement rate. In this case, the measurement noise can be exactly canceled and the evolution converges deterministically to the target state. In the following, we consider the case where the measurement efficiency is not unity and the simplified setting where the feedback coefficients α_p and α_i are assumed to be time-independent. In this experimentally relevant setting, P feedback is not known to be globally optimal. Furthermore, the two-qubit system under measurement and feedback is not linear and therefore is representative of a more general class of quantum systems, in contrast to the linear setting of harmonic oscillator stabilization treated in the next section. This non-linearity makes analytical arguments for optimal feedback laws difficult and therefore we must resort to a numerical study. Specifically, we ask whether it is advantageous to combine P and I feedback.

Consider two qubits subject to an intrinsic Hamiltonian H = h₁σ_z1 + h₂σ_z2 and subject to negligible decoherence. In the following we will assume h₁ = h₂ = h. We measure the half-parity of the qubits [65], which allows a non-local implementation between remote qubits [64]. The relevant measurement operator c is

$\begin{equation}c=\sqrt{k}{L}_{z}=\frac{\sqrt{k}}{2}\left({\sigma }_{z1}+{\sigma }_{z2}\right),\end{equation} \tag{ 14 }$

where k is the measurement strength and the associated measurement current is

$\begin{equation}j\left(t\right)=2\langle {L}_{z}\rangle +\xi \left(t\right)/\sqrt{k\eta }.\end{equation} \tag{ 15 }$

The control goal is to stabilize the system in an entangled state, when starting from a simple product state, |↑⟩ ⊗ |↑⟩ ≡ |↑↑⟩. Given the exchange symmetry of the intrinsic Hamiltonian and the measurement operator (we will be careful to also maintain this symmetry with the feedback operator below), and since the initial state is exchange symmetric, we will remain in the symmetric triplet subspace of two qubits throughout the evolution. This subspace is spanned by the states $\vert {T}_{-1}\rangle =\left\vert {\downarrow}{\downarrow}\right\rangle$ , $\vert {T}_{0}\rangle =\frac{1}{\sqrt{2}}\left(\left\vert {\downarrow}{\uparrow}\right\rangle +\left\vert {\uparrow}{\downarrow}\right\rangle \right)$ , and $\vert {T}_{1}\rangle =\left\vert {\uparrow}{\uparrow}\right\rangle$ . Our goal is to evolve to, and stabilize the system in, the entangled state |T₀⟩. As in reference [39] we use the intuition of rotating the system in the symmetric subspace and choose a local feedback operator $F={L}_{x}=\frac{1}{2}\left({\sigma }_{x1}+{\sigma }_{x2}\right)$ . Applying an L_x rotation can bring |T_±1⟩ closer to |T₀⟩.
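As a quick numerical check of these statements, one can build the triplet states explicitly and verify that L_x couples |T_±1⟩ to |T₀⟩ while ⟨T₀|L_z|T₀⟩ vanishes. The sketch below uses hypothetical variable names and assumes the basis ordering |↑⟩ = (1, 0), |↓⟩ = (0, 1), with |T₀⟩ normalized to unit length.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Triplet states of two qubits
T1 = np.kron(up, up)
Tm1 = np.kron(down, down)
T0 = (np.kron(down, up) + np.kron(up, down)) / np.sqrt(2)

# Collective operators L_z and L_x
Lz = 0.5 * (np.kron(sz, I2) + np.kron(I2, sz))
Lx = 0.5 * (np.kron(sx, I2) + np.kron(I2, sx))

# Mean half-parity signal in the target state
lz_target = np.vdot(T0, Lz @ T0).real
# Matrix elements of the feedback operator between the triplet states
coupling_up = np.vdot(T0, Lx @ T1).real
coupling_down = np.vdot(T0, Lx @ Tm1).real
direct = np.vdot(T1, Lx @ Tm1).real
```

Here `lz_target` and `direct` vanish and both couplings equal 1/√2, so a rotation generated by L_x moves population between |T_±1⟩ and |T₀⟩ but does not connect |T₁⟩ and |T₋₁⟩ directly.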

Since the control goal in this case is to prepare the state |T₀⟩, and the deterministic part of the measurement under this state, ⟨T₀|L_z|T₀⟩ is zero, we may set the goal to be g(t) = ⟨T₀|L_z|T₀⟩ = 0 ∀ t. Hence our error signal is e(t) = j(t). Thus, we obtain the SME that describes the evolution of the two-qubit system for both τ_P = 0 (equation (A1)) and τ_P > 0 (equation (A2)) as shown in appendix A. We employ an exponential filter for the integral feedback:

$\begin{equation}\mathcal{J}\left(t\right)=\frac{1}{{\tau }_{\mathrm{I}}}{\int }_{t-{\tau }_{\mathrm{I}}}^{t}j\left(s\right)\mathrm{exp}\left(-\frac{\left(t-s\right)}{{\tau }_{\mathrm{I}}}\right)\mathrm{d}s.\end{equation} \tag{ 16 }$
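A discretized version of equation (16) can be sketched as below, assuming a uniform time step `dt` and a record of current increments dy = j(t)dt; the function names are hypothetical. With the window truncated at τ_I, a constant current j = 1 yields $\mathcal{J}=1-{\mathrm{e}}^{-1}\approx 0.632$ for this kernel as written.

```python
import numpy as np

def exponential_weights(tau_I, dt):
    """Samples of the kernel exp(-(t - s) / tau_I) / tau_I on the window [t - tau_I, t]."""
    n = int(round(tau_I / dt))
    lags = np.arange(n) * dt          # t - s = 0, dt, ..., tau_I - dt
    return np.exp(-lags / tau_I) / tau_I

def filtered_current(dy, weights):
    """Rectangle-rule estimate of J(t) from increments dy = j dt (most recent last)."""
    dy = np.asarray(dy, dtype=float)
    recent_first = dy[::-1][:len(weights)]
    return float(np.dot(weights[:len(recent_first)], recent_first))
```

Because the weights multiply increments that already contain a factor dt, no additional factor of the time step is needed in the sum.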

To assess the relative performance of the feedback strategies, we will look at the steady state average populations of the three triplet states as well as the average concurrence measure of entanglement. Given a two-qubit density operator ρ, the populations of the triplet states are given by ${T}_{i}=\left\langle {T}_{i}\vert \rho \vert {T}_{i}\right\rangle ,\enspace i=-1,\enspace 0,\enspace 1$ , and the concurrence is defined as [67, 68]

$\begin{equation}\mathcal{C}\left(\rho \right)\equiv \mathrm{max}\left(0,{\lambda }_{1}-{\lambda }_{2}-{\lambda }_{3}-{\lambda }_{4}\right),\end{equation} \tag{ 17 }$

where λ₁,..., λ₄ are the (non-negative) eigenvalues, in decreasing order, of the Hermitian matrix $R=\sqrt{\sqrt{\rho }\tilde {\rho }\sqrt{\rho }}$ with $\tilde {\rho }=\left({\sigma }_{y}\otimes {\sigma }_{y}\right){\rho }^{{\ast}}\left({\sigma }_{y}\otimes {\sigma }_{y}\right),$ the spin flipped state of ρ.
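A compact way to evaluate equation (17) numerically is sketched below. It relies on the standard equivalent characterization that the λ_i are the square roots of the eigenvalues of the (generally non-Hermitian) product $\rho \tilde {\rho }$ , which avoids computing matrix square roots; the function name `concurrence` and the clipping of small negative round-off values are illustrative choices.

```python
import numpy as np

def concurrence(rho):
    """Concurrence of a two-qubit density matrix given as a 4x4 complex array."""
    sy = np.array([[0, -1j], [1j, 0]])
    yy = np.kron(sy, sy)
    rho_tilde = yy @ rho.conj() @ yy          # spin-flipped state
    # Eigenvalues of R are the square roots of the eigenvalues of rho @ rho_tilde
    evals = np.linalg.eigvals(rho @ rho_tilde)
    lam = np.sqrt(np.clip(evals.real, 0.0, None))   # guard against round-off
    lam = np.sort(lam)[::-1]                        # decreasing order
    return max(0.0, lam[0] - lam[1] - lam[2] - lam[3])
```

For the target state |T₀⟩ this returns a value of 1 up to round-off, while for the product state |↑↑⟩ and for the maximally mixed state it returns 0.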

Figure 2 shows these measures of the average evolution of the two-qubit system for the initial state |T₁⟩, under the strategies of P feedback [α_p = 0.2, α_i = 0, panels (a) and (b)], I feedback [α_p = 0, α_i = 0.2, panel (c)], and PI feedback with a specific combination of α_p and α_i [panel (d)]. The parameters of the system are h = 0.1, k = 1, η = 0.4. The choice of k sets the units for the other rates in the model, η was chosen to be consistent with current experimental capabilities [64], and we vary h later to see its effect on the conclusions drawn. The results in this section are for the |T₁⟩ initial state. We have also simulated the protocols and their steady states starting from any mixture of product states in the triplet manifold (the initial states simplest to prepare in experiments) and the results are similar to those shown here for the |T₁⟩ initial state.

Figures 2(a) and (b) show the evolution under P feedback, with and without a time delay. We expect that there is little benefit in introducing a time delay in proportional feedback in this example, since there is no information in prior measurement currents that is germane to the control goal. Indeed this expectation is borne out by these figures; the performance of the time-delayed feedback is worse than without a time delay, τ_P = 0. Figure 2(c) shows the performance under I feedback. The value of the integration time τ_I can be numerically optimized to yield maximum concurrence. The plot in figure 2(c) uses τ_I = 3, which is a near-optimal value for concurrence.

Comparing figure 2(c) with figures 2(a) and (b), it is evident that in the case of inefficient measurements, η < 1, an I feedback strategy is able to produce a significantly higher steady state average concurrence and target T₀ population than a P feedback strategy. Finally, in figure 2(d) we show the average behavior for a specific combination of P and I feedback, i.e., of PI feedback, with α_p = 0.03 and α_i = 0.17. This combined PI feedback strategy performs slightly better than the pure I feedback strategy, thus outperforming both P and I strategies (the long time value of concurrence in figure 2(c) is ∼0.7196 ± 0.0028 and in figure 2(d) is ∼0.7289 ± 0.0028). We have plotted here the results of just one choice of α_p and α_i that combines P and I feedback. This particular choice was made to show that PI feedback can outperform P and I feedback based on a more general analysis of mixing the two types of feedback that we will detail below. Note that the total feedback strength has been kept constant across all the settings shown in figure 2, specifically at α_i + α_p = 0.2, in order to have a fair comparison. We also emphasize that these plots show average values of the state populations and concurrence, where the averages are computed over 8000 evolution trajectories. For efficiency η = 0.4, since none of these protocols achieves cancellation of the measurement noise, the individual trajectories of triplet state populations and concurrence show fluctuations for all four feedback strategies.

Analysis of single trajectories reveals insight into the better performance of the I feedback strategy relative to the P feedback strategies. Representative trajectories of the triplet state populations under I feedback and P feedback with zero time delay are shown in figure 3. In general, the evolution under both feedback strategies drives the system towards the |T₀⟩ state. The population T₀ can reach the value 1 and remain there for some time period until a measurement noise fluctuation entering through the feedback term is large enough to drive it down. Under P feedback, we are conditioning feedback on the raw measurement and thus the T₀ population fluctuations can be large, which results in more frequent transitions out of the target state |T₀⟩. In contrast, the integral component in the I feedback strategy smooths out the measurement current fluctuations, which reduces the probability of the feedback term kicking the system out of the target |T₀⟩ state. Consequently, as we analyze in detail below, the ensemble average of the triplet population, $\mathbb{E}\left[{T}_{0}\right]$ , will be larger for the integral control strategy than for the proportional control strategy.

**Figure 3.** Single trajectories of triplet state populations for proportional feedback (τ_P = 0) and integral feedback (τ_I = 3). The measurement efficiency is η = 0.4 and the initial state is the unentangled state |T₁⟩.

To understand this in more quantitative terms, we have given the evolution of these triplet populations and the associated off-diagonal elements of the density matrix in the triplet subspace under general PI feedback in appendix A. For the case of I feedback, i.e., α_p = 0, α_i > 0, the evolution is:

$\begin{equation}\begin{aligned}\hfill \mathrm{d}{T}_{-1}& =\sqrt{2}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\text{Im}\enspace {T}_{0,-1}\enspace \mathrm{d}t-2\sqrt{\eta k}\left(1+\langle {L}_{z}\rangle \left(t\right)\right){T}_{-1}\enspace \mathrm{d}W\left(t\right),\hfill \\ \hfill \mathrm{d}{T}_{0}& =-\sqrt{2}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\left(\text{Im}\enspace {T}_{0,1}+\text{Im}\enspace {T}_{0,-1}\right)\mathrm{d}t-2\sqrt{\eta k}\langle {L}_{z}\rangle \left(t\right){T}_{0}\enspace \mathrm{d}W\left(t\right),\hfill \\ \hfill \mathrm{d}{T}_{1}& =\sqrt{2}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\text{Im}\enspace {T}_{0,1}\enspace \mathrm{d}t+2\sqrt{\eta k}\left(1-\langle {L}_{z}\rangle \left(t\right)\right){T}_{1}\enspace \mathrm{d}W\left(t\right),\hfill \\ \hfill \mathrm{d}{T}_{1,-1}& =\left\{2\left[\mathrm{i}2h-k\right]{T}_{1,-1}+\frac{\mathrm{i}{\alpha }_{\mathrm{i}}}{\sqrt{2}}\mathcal{J}\left(t\right)\left({T}_{1,0}-{T}_{0,-1}\right)\right\}\enspace \mathrm{d}t-2\sqrt{\eta k}\langle {L}_{z}\rangle \left(t\right){T}_{1,-1}\enspace \mathrm{d}W\left(t\right),\hfill \\ \hfill \mathrm{d}{T}_{0,1}& =\left\{-\left[\mathrm{i}2h+\frac{k}{2}\right]{T}_{0,1}-\frac{\mathrm{i}{\alpha }_{\mathrm{i}}}{\sqrt{2}}\mathcal{J}\left(t\right)\left({T}_{1}-{T}_{0}+{T}_{-1,1}\right)\right\}\enspace \mathrm{d}t+\sqrt{\eta k}\left(1-2\langle {L}_{z}\rangle \left(t\right)\right){T}_{0,1}\enspace \mathrm{d}W\left(t\right),\hfill \\ \hfill \mathrm{d}{T}_{0,-1}& =\left\{\left[\mathrm{i}2h-\frac{k}{2}\right]{T}_{0,-1}-\frac{\mathrm{i}{\alpha }_{\mathrm{i}}}{\sqrt{2}}\mathcal{J}\left(t\right)\left({T}_{0}-{T}_{-1}-{T}_{1,-1}\right)\right\}\enspace \mathrm{d}t-\sqrt{\eta k}\left(1+2\langle {L}_{z}\rangle \left(t\right)\right){T}_{0,-1}\enspace \mathrm{d}W\left(t\right).\hfill \end{aligned}\end{equation} \tag{ 18 }$

We suppress the time index of T_i and T_i,j here for notational conciseness. We cannot take the ensemble average (to obtain the average evolution) by simply dropping the stochastic terms in this case, because $\mathcal{J}\left(t\right)$ and T_i and T_i,j are correlated by virtue of the dependence of both on past Wiener increments. Moreover, due to their nonlinearity we cannot solve these equations directly. However, we can use the following argument to show that equation (18) has an (unstable) steady state when T₀ = 1. Suppose at some time, T₀ reaches 1 and we have T₁ = T₋₁ = 0 (and thus ⟨L_z⟩ = 0). Then the coherences T_1,−1, T_0,1, T_0,−1 will be approximately zero also (since all populations other than T₀ are zero). As a result, in the above equations dT₋₁ = dT₀ = dT₁ = dT_1,−1 ≈ 0, and the only coherences that evolve are given by $\mathrm{d}{T}_{0,1}=-\mathrm{d}{T}_{0,-1}=\mathrm{i}\frac{{\alpha }_{\mathrm{i}}}{\sqrt{2}}\mathcal{J}\left(t\right)\mathrm{d}t$ . These coherences are generated by a non-zero $\mathcal{J}\left(t\right)$ , and then go on to generate non-zero populations in the undesired states T₋₁ and T₁. This perturbation away from the desired state is weak because of two factors: (i) $\mathcal{J}\left(t\right)$ can be made small when T₀ = 1, since the deterministic part of j(t) is zero, and the averaging integral will dampen the fluctuations dW(t) over the period τ_I, and (ii) the coherences are dampened at a rate k/2, and therefore even when coherences are generated by non-zero $\mathcal{J}\left(t\right)$ , they can be quickly dampened by the measurement induced dephasing before they generate non-zero populations in the undesired states.

It is clear that the integration time τ_I is an important parameter for the integral control strategy. Optimization of this parameter involves a tradeoff between smoothing and time delay in the feedback action as τ_I increases. Specifically, we can expect that a longer integration time τ_I will improve the concurrence, due to the reduced fluctuations, but because the signal is being averaged over a longer time window, it will take longer for deviations away from the target value to affect the averaged value, resulting in a time delay in the feedback action. To illustrate the resulting trade-off between short and long integration time choices, figure 4(a) plots the steady state average concurrence as a function of the filter integration time τ_I for I feedback. Note that the τ_I = 0 reference value refers to the proportional feedback strategy with no delay. The generic behavior shown here is found for any value of the feedback strength α_i, i.e., for all α_i values we see that the concurrence shows a maximum value at a non-zero optimal filter integration time. This optimal value of τ_I decreases as the control parameter α_i increases (not shown). We also find that the system takes increasingly longer times to reach steady state as the feedback strength α_i goes to zero, or as τ_I gets larger.

**Figure 4.** Steady state average concurrence dependence on integration time and relative weight of P and I control in PI control, measured by the mixing ratio θ, equation (19). For all calculations shown here, the initial state is taken to be the unentangled state T₁ and all results are averaged over 8000 trajectories. (a) Steady state average concurrence vs integration time τ_I for pure integral (I) control, for intrinsic Hamiltonian parameters h₁ = h₂ = h = 0.1, measurement efficiency η = 0.4 and two different integral feedback coefficient values α_i. (b) Steady state average concurrence vs mixing ratio θ for intrinsic Hamiltonian parameters h₁ = h₂ = h = 0.1, measurement efficiency η = 0.4 and PI feedback with different values of total feedback strength f_PI. For f_PI = 0.3 (blue line), the integration time for the integral component is τ_I = 1; for f_PI = 0.2 (red line), τ_I = 3. (c) Steady state average concurrence vs mixing ratio θ for intrinsic Hamiltonian parameters h₁ = h₂ = h = 0.1, and total feedback strength f_PI = 0.2 with integration time τ_I = 3, under various values of measurement efficiency. (d) Steady state average concurrence vs mixing ratio θ for intrinsic Hamiltonian parameters h₁ = h₂ = h = 0.5, and measurement efficiency η = 0.4 under PI feedback. For f_PI = 0.3 (blue line) the integration time parameter is τ_I = 1 and for f_PI = 0.2 (red line), τ_I = 3. The key difference with panel (b) is that here h = 0.5, and in this case I feedback is superior to any mixture of P and I feedback.

Finally, we explore in more detail the possibility of full PI feedback, i.e., combining proportional and integral feedback for the problem of entangled state generation with inefficient measurements in this two qubit system. In figure 2(d) we already showed that there was a small benefit to combining both strategies for a particular set of coefficients. To study the performance of the combined strategy more systematically, we write

$\begin{equation}{\alpha }_{\mathrm{p}}=\left(1-\theta \right){f}_{\mathrm{P}\mathrm{I}},\quad {\alpha }_{\mathrm{i}}=\theta {f}_{\mathrm{P}\mathrm{I}}\end{equation} \tag{ 19 }$

where f_PI is the total feedback strength and θ ∈ [0, 1] is a mixing ratio quantifying the combination of the two strategies. In figure 4(b) we now plot the steady state average concurrence versus this strategy mixing ratio θ for PI feedback, while keeping the total feedback strength f_PI constant. The plot shows the existence of an optimal mixing ratio θ_o located between ∼0.7 and ∼0.9, i.e., the optimal strategy is to have mostly integral control with some admixture of proportional control. The precise value of this optimal mixing ratio depends on the total feedback strength f_PI. However, as shown in figure 4(c), θ_o is quite robust to variations in efficiency. Note that the maximum concurrence obtained by this PI feedback strategy for perfect efficiency, η = 1, is less than that obtained using the globally optimal P feedback strategy with time-dependent proportionality constant α_p(t) [39, 53].
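As a small illustration of the parametrization in equation (19), the following Python sketch splits a fixed total strength f_PI into proportional and integral coefficients. The function name `pi_coefficients` is introduced here for illustration only and is not part of the original analysis.

```python
def pi_coefficients(f_pi, theta):
    """Split the total feedback strength f_pi according to equation (19).

    theta = 0 gives pure proportional (P) feedback,
    theta = 1 gives pure integral (I) feedback.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    alpha_p = (1.0 - theta) * f_pi  # proportional coefficient
    alpha_i = theta * f_pi          # integral coefficient
    return alpha_p, alpha_i


# Example: a mostly-integral mixture at the total strength used in figure 4(b).
alpha_p, alpha_i = pi_coefficients(0.2, 0.8)
print(alpha_p, alpha_i)
```

By construction α_p + α_i = f_PI for every θ, so scanning θ at fixed f_PI changes only the relative weight of the two strategies.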

These results show that the advantage of PI control relative to pure I or pure P control increases as the total feedback strength parameter f_PI increases. This can be seen by comparing the difference in steady state average concurrence between P, I and PI with optimal θ_o for f_PI = 0.2 (red line) and f_PI = 0.3 (blue line) in figure 4(b). Finally, we note that the optimal mixing ratio also depends on the system Hamiltonian, in particular, the value of h. In this case, for larger values of h, the optimal mixing parameter θ_o → 1 and the optimal feedback strategy becomes just I feedback. We show the concurrence versus θ curves for h = 0.5 in figure 4(d), for comparison with figure 4(b).

4. Harmonic oscillator state stabilization

State stabilization of a quantum harmonic oscillator is a canonical quantum feedback control problem that has been studied for several decades [6, 10, 29, 52, 69]. This problem has many practical applications, including the cooling and manipulation of trapped cold ions [70] or atoms [71], and cooling of nanoscale [72] or even macroscopic [73, 74] mechanical systems. Purely proportional feedback control schemes have been developed for this problem [6, 29, 52, 69]. In the following, we investigate whether adding integral control adds any benefit in terms of control accuracy.

The system is a quantum harmonic oscillator with mass m and angular frequency ω. We apply a continuous measurement of the oscillator position x with strength k (i.e., $c=\sqrt{k}x$ in the notation of section 2) and efficiency η. The SME describing the system under measurement is [52]

$\begin{equation}\mathrm{d}\rho =-\mathrm{i}\left[{H}_{0},\rho \right]\mathrm{d}t+2\gamma \left(N+1\right)\mathcal{D}\left[a\right]\rho \enspace \mathrm{d}t+2\gamma N\mathcal{D}\left[{a}^{{\dagger}}\right]\rho \enspace \mathrm{d}t+k\mathcal{D}\left[x\right]\rho \enspace \mathrm{d}t+\sqrt{\eta k}\mathcal{H}\left[x\right]\rho \enspace \mathrm{d}W,\end{equation} \tag{ 20 }$

where H₀ = p²/(2m) + mω²x²/2, p is the oscillator momentum operator and a is the annihilation operator. The terms proportional to γ describe damping and excitation due to coupling to a bosonic thermal bath with mean occupation N. The associated measurement signal is

$\begin{equation}j\left(t\right)=2\langle x\rangle \left(t\right)+\xi \left(t\right)/\sqrt{k\eta }.\end{equation} \tag{ 21 }$

We shall consider two types of feedback for this system. First, we consider linear feedback in both x and p, in which case we have two feedback operators:

$\begin{equation}{F}_{1}=x,\quad {F}_{2}=p.\end{equation} \tag{ 22 }$

We will attach (time-dependent) proportional coefficients (α_p1, α_p2) and integral coefficients (α_i1, α_i2) to each of these feedback operators. The total feedback operator is then

$\begin{equation*}\left[{\alpha }_{\mathrm{p}\mathrm{1}}\left(t\right)e\left(t-{\tau }_{\mathrm{P}}\right)+{\alpha }_{\mathrm{i}\mathrm{1}}\left(t\right)\mathcal{J}\left(t\right)\right]x+\left[{\alpha }_{\mathrm{p}\mathrm{2}}\left(t\right)e\left(t-{\tau }_{\mathrm{P}}\right)+{\alpha }_{\mathrm{i}\mathrm{2}}\left(t\right)\mathcal{J}\left(t\right)\right]p.\end{equation*}$

Applying F₁ is usually considerably easier than F₂, since the former corresponds to applying a force on the oscillator. Therefore, we will also consider the setting where only F₁ is available, in which case we have only the coefficients α_p1, α_i1. Given the simplicity of the harmonic system, it is possible to set up analytic candidate control laws that are specified in terms of choices for the coefficients α_p1, α_p2, α_i1, α_i2, and to then assess whether they are consistent with P, I or PI feedback. The SMEs of the harmonic oscillator for both cases (τ_P > 0 and τ_P = 0) are given by equations (B1) and (B2) in appendix B.

In the simplest setting where the system starts in a Gaussian state, the state remains Gaussian when evolved according to the above measurement and feedback dynamics since all operators acting on the density matrix are linear or quadratic in x, p [6, 52]. A Gaussian state is completely determined by its first moments ( $\left\langle x\right\rangle ,\left\langle p\right\rangle$ ) and second moments ( ${V}_{x}\equiv \left\langle {\left(x-\left\langle x\right\rangle \right)}^{2}\right\rangle$ , ${V}_{p}\equiv \left\langle {\left(p-\left\langle p\right\rangle \right)}^{2}\right\rangle$ , ${C}_{xp}\equiv \frac{1}{2}\left\langle xp+px\right\rangle -\left\langle x\right\rangle \left\langle p\right\rangle$ ). The evolution of the second moments under the above measurement, thermal damping, and feedback is deterministic: it is independent of both the feedback and the measurement noise ξ(t) [6]. The equations of motion for the second moments are given by equation (C1) in appendix C. We will assume in the following that these equations are solved in advance and therefore that V_x(t), V_p(t) and C_xp(t) are known functions of time. In all of the examples treated in this section, we shall take the initial state to be a coherent state with V_x(0) = V_p(0) = 0.5 and C_xp(0) = 0.

Under the measurement and feedback dynamics described in equations (B1) and (B2), the evolution of the first moments is given by tr[x dρ(t)] and tr[p dρ(t)]:

$\begin{align}\hfill \mathrm{d}\langle x\rangle \left(t\right)& =\frac{1}{m}\langle p\rangle \left(t\right)\mathrm{d}t-\gamma \langle x\rangle \left(t\right)\mathrm{d}t+{\alpha }_{\mathrm{i}\mathrm{2}}\mathcal{J}\left(t\right)\mathrm{d}t+{\alpha }_{\mathrm{p}\mathrm{2}}\left(2\langle x\rangle \left(t-{\tau }_{\mathrm{P}}\right)-g\left(t-{\tau }_{\mathrm{P}}\right)\right)\mathrm{d}t\hfill \\ \hfill & \quad +\sqrt{\eta k}\left(2{V}_{x}\left(t\right)\mathrm{d}W\left(t\right)+\frac{{\alpha }_{\mathrm{p}\mathrm{2}}}{k\eta }\enspace \mathrm{d}W\left(t-{\tau }_{\mathrm{P}}\right)\right),\hfill \end{align} \tag{ 23a }$

$\begin{align}\hfill \qquad \mathrm{d}\langle p\rangle \left(t\right)& =-m{\omega }^{2}\langle x\rangle \left(t\right)\mathrm{d}t-\gamma \langle p\rangle \left(t\right)\mathrm{d}t-{\alpha }_{\mathrm{i}\mathrm{1}}\mathcal{J}\left(t\right)\mathrm{d}t-{\alpha }_{\mathrm{p}\mathrm{1}}\left(2\langle x\rangle \left(t-{\tau }_{\mathrm{P}}\right)-g\left(t-{\tau }_{\mathrm{P}}\right)\right)\mathrm{d}t\hfill \\ \hfill & \quad +\sqrt{\eta k}\left(2{C}_{xp}\left(t\right)\mathrm{d}W\left(t\right)-\frac{{\alpha }_{\mathrm{p}\mathrm{1}}}{k\eta }\enspace \mathrm{d}W\left(t-{\tau }_{\mathrm{P}}\right)\right),\hfill \end{align} \tag{ 23b }$

where τ_P ⩾ 0. In the limit of zero time delay, the equations of motion for the first moments are the same as above, with τ_P = 0 (this reduction for the evolution of the first moments of the quadratures is a special case since as noted above, taking τ_P = 0 in equation (12) does not yield equation (13)).

Our overall control goal is state stabilization, where the aim is to center the state at an arbitrary stationary (time-independent) value of the two quadrature means in the rotating frame of the oscillator, notated (X_g, P_g). In the laboratory frame this control goal is specified by the mean quadrature values (x_g(t), p_g(t)), which are related to (X_g, P_g) by the transformation

$\begin{align}\hfill {x}_{g}\left(t\right)& ={X}_{g}\enspace \mathrm{cos}\left(\omega t\right)+{P}_{g}\enspace \mathrm{sin}\left(\omega t\right)/\left(m\omega \right),\hfill \end{align} \tag{ 24a }$

$\begin{align}\hfill {p}_{g}\left(t\right)& =-m\omega {X}_{g}\enspace \mathrm{sin}\left(\omega t\right)+{P}_{g}\enspace \mathrm{cos}\left(\omega t\right).\hfill \end{align} \tag{ 24b }$

We note that the oscillator cooling problem [6] can be viewed as a special case of this state stabilization with the control goal (X_g = 0, P_g = 0).
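The frame change in equation (24) is a time-dependent rotation in phase space, and it can be sketched in a few lines of Python. The function names below are ours, and the inverse map is obtained simply by solving equation (24) for (X, P); this is an illustration, not code from the original work.

```python
import math


def lab_from_rotating(X, P, t, m=1.0, omega=1.0):
    """Equation (24): rotating-frame means (X, P) -> laboratory-frame means (x, p)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    x = X * c + P * s / (m * omega)
    p = -m * omega * X * s + P * c
    return x, p


def rotating_from_lab(x, p, t, m=1.0, omega=1.0):
    """Inverse of equation (24), found by solving it for (X, P)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    X = x * c - p * s / (m * omega)
    P = m * omega * x * s + p * c
    return X, P


# Target used in figure 5 (m = omega = 1): X_g = 6, P_g = 4 m omega.
x_g, p_g = lab_from_rotating(6.0, 4.0, t=1.3)
print(x_g, p_g)
print(rotating_from_lab(x_g, p_g, t=1.3))  # recovers (6, 4) up to rounding
```

A stationary target in the rotating frame therefore corresponds to a laboratory-frame target that oscillates at the oscillator frequency ω.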

The evolution of the first order moments in the rotating frame is given by

$\begin{align}\hfill \mathrm{d}\langle X\rangle \left(t\right)& =\left(\mathrm{d}\langle x\rangle \left(t\right)-\frac{1}{m}\langle p\rangle \left(t\right)\mathrm{d}t\right)\mathrm{cos}\left(\omega t\right)-\left(m{\omega }^{2}\langle x\rangle \left(t\right)\mathrm{d}t+\mathrm{d}\langle p\rangle \left(t\right)\right)\mathrm{sin}\left(\omega t\right)/m\omega ,\hfill \end{align} \tag{ 25a }$

$\begin{align}\hfill \mathrm{d}\langle P\rangle \left(t\right)& =m\omega \left(\mathrm{d}\langle x\rangle \left(t\right)-\frac{1}{m}\langle p\rangle \left(t\right)\mathrm{d}t\right)\mathrm{sin}\left(\omega t\right)+\left(m{\omega }^{2}\langle x\rangle \left(t\right)\mathrm{d}t+\mathrm{d}\langle p\rangle \left(t\right)\right)\mathrm{cos}\left(\omega t\right).\hfill \end{align} \tag{ 25b }$

For later convenience we define the deviations from the target mean values in the rotating frame by $\tilde {X}\left(t\right)=\langle X\rangle \left(t\right)-{X}_{g}$ and $\tilde {P}\left(t\right)=\langle P\rangle \left(t\right)-{P}_{g}$ and put these deviations together in a vector $Z\left(t\right)\equiv {\left[\tilde {X}\left(t\right),\tilde {P}\left(t\right)\right]}^{\mathsf{T}}$ .

We must choose an error signal, e(t), that is based on this control goal and the measurement signal that we have access to. According to the description above, there are two components to the target state in this problem, one for each quadrature of the oscillator, i.e., X_g and P_g. However, since our measurements are made in the laboratory frame and we measure only the x-quadrature, from now on we shall specify the goal function to be g(t) = 2x_g(t), so that the error signal is then e(t) = j(t) − 2x_g(t).

Finally, we note that in this work we shall restrict ourselves to the regime of weak measurement and damping k, γ ≪ mω², where the measurement extracts some information about the system at each timestep but does not completely distort the harmonic evolution. Similarly, the system is under-damped by the thermal bath. In this limit, it is valid to still define the characteristic period of the oscillator as T = 2π/ω.

In the following subsections we consider first the case of x measurement with feedback controls in both x and p (section 4.1) and then the case of x measurement with feedback control only in x (section 4.2).

4.1. x and p control

We now analyze the case of x measurement with feedback controls in both x and p.

4.1.1. Proportional feedback

We first consider proportional feedback only, i.e., α_i1 = α_i2 = 0 in equation (23). We shall show that the quadrature expectations of any state can be driven to the target values (X_g, P_g) by setting α_p1(t) = 2kηC_xp(t), α_p2(t) = −2kηV_x(t) and τ_P = 0. However, in order to compensate for the thermal damping, we also need to add a term γ(x_g(t)p + p_g(t)x) to the Hamiltonian H₀ (note that this is not a feedback term, since it is not dependent on the measurement record). The evolution of the first moments of the oscillator with these settings is given in equations (C2) in appendix C.2, and when these equations are transformed into the rotating frame, the deviations $\tilde {X}$ and $\tilde {P}$ evolve as:

$\begin{align}\hfill \mathrm{d}\tilde {X}& =-\gamma \tilde {X}\enspace \mathrm{d}t-4k\eta \left({V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)-{C}_{xp}\left(t\right)\mathrm{sin}\left(\omega t\right)/m\omega \right)\left[\tilde {X}\enspace \mathrm{cos}\left(\omega t\right)+\tilde {P}\enspace \mathrm{sin}\left(\omega t\right)/m\omega \right]\mathrm{d}t,\hfill \end{align} \tag{ 26a }$

$\begin{align}\hfill \mathrm{d}\tilde {P}& =-\gamma \tilde {P}\enspace \mathrm{d}t-4k\eta \left(m\omega {V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)+{C}_{xp}\left(t\right)\mathrm{cos}\left(\omega t\right)\right)\left[\tilde {X}\enspace \mathrm{cos}\left(\omega t\right)+\tilde {P}\enspace \mathrm{sin}\left(\omega t\right)/m\omega \right]\mathrm{d}t.\hfill \end{align} \tag{ 26b }$

We now see that our choice of proportional feedback coefficients α_p1(t) and α_p2(t) has allowed the feedback to completely cancel all measurement noise contributions (captured by the dW terms), resulting in deterministic equations for the evolution of the mean values ⟨x⟩(t) and ⟨p⟩(t). The fact that such cancellation is possible was already noted in the early studies of feedback cooling of quantum oscillators [6]. In addition, as we shall prove explicitly below, these coefficients make use of the thermal and measurement induced dissipation to steer the system to the target quadrature mean values.
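The cancellation can be checked directly from the dW(t) prefactors in equation (23) at τ_P = 0. The short Python sketch below evaluates those prefactors for given coefficients; the function name and the sample values of V_x and C_xp are illustrative assumptions.

```python
import math


def noise_prefactors(k, eta, V_x, C_xp, alpha_p1, alpha_p2):
    """Coefficients multiplying dW(t) in equations (23a) and (23b) for tau_P = 0."""
    scale = math.sqrt(eta * k)
    in_x = scale * (2.0 * V_x + alpha_p2 / (k * eta))   # from equation (23a)
    in_p = scale * (2.0 * C_xp - alpha_p1 / (k * eta))  # from equation (23b)
    return in_x, in_p


k, eta = 0.02, 0.4       # values used in figure 5
V_x, C_xp = 0.5, 0.1     # sample second moments (assumed for illustration)

# Coefficients proposed in the text.
alpha_p1 = 2.0 * k * eta * C_xp
alpha_p2 = -2.0 * k * eta * V_x

print(noise_prefactors(k, eta, V_x, C_xp, alpha_p1, alpha_p2))  # both vanish up to rounding
print(noise_prefactors(k, eta, V_x, C_xp, 0.0, 0.0))            # no feedback: noise remains
```

Because V_x(t) and C_xp(t) evolve in time, the coefficients must track them at every instant for the cancellation to hold throughout the evolution.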

Figure 5(a) shows the evolution of the mean values of the quadratures in the rotating frame under this control law for an arbitrary initial state (specified in the caption). The evolution behavior suggests that this proportional control law yields exponential convergence to the goal quadrature values. To understand why this particular control law works and to prove the exponential nature of the convergence to the target state, we begin by noting that the coefficients in the system of differential equations in equation (26) display fast oscillations through the cos(ωt) and sin(ωt) terms, while the changes in the other time-dependent terms, V_x(t), V_p(t) and C_xp(t) are small over the timescale of these oscillations. Therefore we may approximate this evolution by another system with new coefficients defined by time-averaging the coefficients in equation (26) over one oscillator period T, and treating all time-varying quantities other than cos(ωt) and sin(ωt) as constants. For example, ${V}_{x}\left(t\right){\mathrm{cos}}^{2}\left(\omega t\right)\approx \frac{{V}_{x}\left(t\right)}{2}$ since $\frac{1}{T}{\int }_{0}^{T}\mathrm{d}t\enspace {\mathrm{cos}}^{2}\left(\omega t\right)=\frac{1}{2}$ , and C_xp(t)sin(ωt)cos(ωt) ≈ 0 since $\frac{1}{T}{\int }_{0}^{T}\mathrm{d}t\enspace \mathrm{cos}\left(\omega t\right)\mathrm{sin}\left(\omega t\right)=0$ . We refer to this approximation as period-averaging, but note that it is equivalent to the rotating wave approximation, since it amounts to dropping fast rotating terms in the evolution operator in the rotating frame. In appendix D we show that this is a very good approximation in the regime k, γ ≪ mω². The period-averaged dynamics for the above system, written in matrix form is $\dot {Z}\left(t\right)\approx A\left(t\right)Z\left(t\right)$ (recall that $Z\left(t\right)={\left[\tilde {X}\left(t\right),\tilde {P}\left(t\right)\right]}^{\mathsf{T}}$ ), with

$\begin{equation}A\left(t\right)=\left[\begin{matrix}\hfill -\gamma -2k\eta {V}_{x}\left(t\right)\hfill & \hfill \frac{2k\eta }{{\left(m\omega \right)}^{2}}{C}_{xp}\left(t\right)\hfill \\ \hfill -2k\eta {C}_{xp}\left(t\right)\hfill & \hfill -\gamma -2k\eta {V}_{x}\left(t\right)\hfill \end{matrix}\right].\end{equation} \tag{ 27 }$

The deviation from the target mean values at time t is given by $Z\left(t\right)=\mathrm{exp}\left({\int }_{0}^{t}A\left(\tau \right)\mathrm{d}\tau \right)Z\left(0\right)$ . The matrix A(t) has eigenvalues $-\gamma -2k\eta {V}_{x}\left(t\right){\pm}\mathrm{i}\frac{2k\eta {C}_{xp}\left(t\right)}{m\omega }$ , for which the real parts are negative for all t. Hence, this is a stable system that converges exponentially towards the $Z=0$ fixed point. We may view $Z\left(t\right)$ as a vector Lyapunov function guaranteeing the stability of the final state [75]. This shows that for this choice of proportional feedback parameters one can completely cancel the measurement noise and obtain a deterministic system that exponentially stabilizes an arbitrary initial state.
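A rough numerical check of the period-averaging argument can be sketched as follows. We integrate equation (26) with a forward-Euler step and compare the decay of |Z| with the rate given by the real part of the eigenvalues of A. As a simplifying assumption, V_x and C_xp are frozen at their initial coherent-state values (V_x = 0.5, C_xp = 0) rather than evolved with equation (C1), and we take m = ω = 1 as in figure 5; the function names are ours.

```python
import cmath
import math


def eigenvalues_A(gamma, k, eta, V_x, C_xp, m=1.0, omega=1.0):
    """Eigenvalues of the 2x2 matrix A in equation (27) (equal diagonal entries)."""
    a = -gamma - 2.0 * k * eta * V_x              # diagonal entry
    b = 2.0 * k * eta * C_xp / (m * omega) ** 2   # upper-right entry
    c = -2.0 * k * eta * C_xp                     # lower-left entry
    root = cmath.sqrt(b * c)                      # b * c <= 0, so root is imaginary
    return a + root, a - root


def integrate_eq26(X_dev, P_dev, t_final, dt, gamma, k, eta, V_x, C_xp,
                   m=1.0, omega=1.0):
    """Forward-Euler integration of equation (26) with constant V_x and C_xp."""
    for i in range(int(round(t_final / dt))):
        t = i * dt
        cos_t, sin_t = math.cos(omega * t), math.sin(omega * t)
        x_dev = X_dev * cos_t + P_dev * sin_t / (m * omega)
        dX = (-gamma * X_dev
              - 4.0 * k * eta * (V_x * cos_t - C_xp * sin_t / (m * omega)) * x_dev) * dt
        dP = (-gamma * P_dev
              - 4.0 * k * eta * (m * omega * V_x * sin_t + C_xp * cos_t) * x_dev) * dt
        X_dev, P_dev = X_dev + dX, P_dev + dP
    return X_dev, P_dev


gamma = k = 0.02
eta = 0.4
# Initial deviations from figure 5: (10 - 6, 10 - 4) in units with m = omega = 1.
X_end, P_end = integrate_eq26(4.0, 6.0, 100.0, 0.01, gamma, k, eta, V_x=0.5, C_xp=0.0)
ratio = math.hypot(X_end, P_end) / math.hypot(4.0, 6.0)
rate = -eigenvalues_A(gamma, k, eta, 0.5, 0.0)[0].real
print(ratio, math.exp(-rate * 100.0))  # the two numbers should be close
```

With these parameters the predicted rate is γ + 2kηV_x = 0.028, and the oscillating terms dropped by period-averaging only modulate the decay weakly.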

**Figure 5.** Evolution of expectation values of the quadratures of an oscillator in the rotating frame, subject to continuous measurement, x and p feedback control and thermal damping. The parameters of the system are as follows: m = ω = N = 1, γ = k = mω²/(50), η = 0.4. The initial state is set to ⟨X⟩ = 10, ⟨P⟩ = 10mω and the target values are set to X_g = 6, P_g = 4mω (marked by dotted lines in both subfigures). For these simulations we used dt = T/250 = 0.0251. (a) Proportional feedback. The equations for ⟨X⟩, ⟨P/mω⟩ are deterministic (equation (26)) and converge exponentially to the target values. (b) Integral feedback. The characteristic time τ_I for the exponential filter is set to 0.15T. The red and blue solid lines show the evolution of the expectations ⟨X⟩, ⟨P/mω⟩, for one trajectory. This evolution is now subject to measurement noise and is not deterministic (equation (29)). The green and purple lines show the behavior of the ensemble average over 1000 trajectories, ${X}_{a}\left(t\right)=\mathbb{E}\langle X\rangle \left(t\right)$ and ${P}_{a}/m\omega =\mathbb{E}\langle P/m\omega \rangle \left(t\right)$ . The maximum standard deviation of the trajectories ⟨X⟩(t) and ⟨P⟩(t) increases with τ_I, saturating at 0.7610 at long times.

The P feedback strategy developed above requires τ_P = 0, a condition that is experimentally challenging to achieve due to the finite bandwidth of any feedback control loop. Therefore, we have also tested the performance of the feedback law when τ_P > 0, in order to investigate the robustness of this strategy. The effect of finite time delay on individual trajectories and on the average state evolution is shown in appendix F for several values of τ_P. We find that the ensemble averages over trajectories for both quadratures, $\mathbb{E}\langle X\left(t\right)\rangle$ and $\mathbb{E}\langle P\left(t\right)\rangle$ , still converge to the target values, although over longer timescales than for the ideal τ_P = 0 setting [here $\mathbb{E}\left[\cdot \right]$ denotes an expectation over trajectories (measurement outcomes)]. However, the individual trajectories no longer converge for finite τ_P values, and fluctuate around the target values. A detailed analysis of this behavior is given in appendix F. This general behavior, in which ensemble averages converge to the target while individual trajectories show final state fluctuations about the average, resembles the stabilization performance under the I feedback strategy that we discuss in the next subsection.

4.1.2. Integral feedback

Now we examine the dynamics obtained by setting α_p1 = α_p2 = 0 in equation (23), which corresponds to applying only integral control. The measurement current j(t) provides a noisy estimate of the oscillator position, so it is necessary to filter this in order to obtain a smoothed estimate of the error signal e(t). We use the following exponential filter with memory τ_I:

$\begin{equation}\mathcal{J}\left(t\right)=\frac{1}{{\tau }_{\mathrm{I}}}{\int }_{t-{\tau }_{\mathrm{I}}}^{t}\left(j\left(s\right)-2{x}_{g}\left(s\right)\right)\mathrm{exp}\left(-\frac{\left(t-s\right)}{{\tau }_{\mathrm{I}}}\right)\mathrm{d}s.\end{equation} \tag{ 28 }$
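In a discrete-time implementation, equation (28) becomes a weighted sum over the most recent samples of the error signal. The following Python sketch uses a midpoint rule on a uniform time grid. The function name, the list-based buffer and the sampling scheme are assumptions made for illustration; in an experiment the samples would come from the noisy measurement record.

```python
import math


def filtered_error(errors, dt, tau_i):
    """Midpoint-rule discretization of equation (28).

    errors[-1] is the error sample e = j - 2 x_g for the most recent interval
    of length dt, errors[-2] the one before it, and so on.
    """
    n_window = int(round(tau_i / dt))   # number of samples inside [t - tau_I, t]
    window = errors[-n_window:] if n_window > 0 else []
    total = 0.0
    for age, e in enumerate(reversed(window)):
        lag = (age + 0.5) * dt          # t - s at the midpoint of the interval
        total += e * math.exp(-lag / tau_i) * dt
    return total / tau_i


# A constant unit error signal as a simple sanity check.
J = filtered_error([1.0] * 500, dt=0.01, tau_i=1.0)
print(J, 1.0 - math.exp(-1.0))
```

For a constant error signal the filter in equation (28), as written, returns 1 − e⁻¹ ≈ 0.632 times that constant, which is a useful check on any discretization.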

Our choices for the coefficients α_i1 and α_i2 in the presence of such an integral filter are motivated by the same factor as in the P feedback case above, namely to cancel as much of the measurement noise as possible. While it is not possible to do this exactly with I feedback, we show below that the choice α_i1(t) = 2kηC_xp(t) and α_i2(t) = −2kηV_x(t) does provide exponential convergence of the quadratures to their target values on average. As in the proportional feedback case, we also add a term γ(x_g(t)p + p_g(t)x) to the Hamiltonian H₀ to compensate for thermal damping.

The evolution of d⟨x⟩ and d⟨p⟩ with these feedback settings is given in appendix C.3. Converting these to the rotating frame and writing equations of motion for the deviations $\tilde {X}$ and $\tilde {P}$ yields:

$\begin{equation}\begin{aligned}\hfill \mathrm{d}\tilde {X}\left(t\right)& =-\gamma \tilde {X}\left(t\right)\mathrm{d}t-2\sqrt{k\eta }\left(\sqrt{k\eta }\mathcal{J}\left(t\right)\mathrm{d}t-\mathrm{d}W\left(t\right)\right)\left({V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)-{C}_{xp}\left(t\right)\mathrm{sin}\left(\omega t\right)/m\omega \right),\hfill \\ \hfill \mathrm{d}\tilde {P}\left(t\right)& =-\gamma \tilde {P}\left(t\right)\mathrm{d}t-2\sqrt{k\eta }\left(\sqrt{k\eta }\mathcal{J}\left(t\right)\mathrm{d}t-\mathrm{d}W\left(t\right)\right)\left(m\omega {V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)+{C}_{xp}\left(t\right)\mathrm{cos}\left(\omega t\right)\right).\hfill \end{aligned}\end{equation} \tag{ 29 }$

A typical evolution (trajectory), started from the same initial state as for the P feedback above, is shown in figure 5(b). We now see random fluctuations in the evolution of the quadrature expectations because the measurement noise has not been exactly cancelled by the I feedback. Indeed this is now not possible, since the measurement noise term dW(t) is arbitrarily varying while the integral feedback term is not. Consequently, single trajectories will fluctuate around the target values, preventing perfect state stabilization of individual evolutions. However, the average values of the quadratures (marked by the solid lines labeled X_a and P_a/mω in figure 5(b)) do converge exponentially to the goal values. In figure 5(b) and in subsequent figures where we show stochastic trajectories, we will state the 'maximum standard deviation' at steady state for these trajectories. The standard deviations of ⟨X⟩(t) and ⟨P⟩(t) (calculated over multiple trajectories) are the same but time-dependent and oscillatory at long times. However, this standard deviation is within a narrow range and thus we quote the maximum value over a time window in the steady state region (which is defined as when $\mathbb{E}\langle X\rangle \left(t\right)$ and $\mathbb{E}\langle P\rangle \left(t\right)$ reach constant values).

To analyze this behavior and prove the exponential convergence of the average over trajectories, we again write equation (29) in matrix form as dZ(t) = AZ(t)dt + b(t)dt + c(t)dW(t), with

$\begin{align}\hfill A& =\left[\begin{matrix}\hfill -\gamma \hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill -\gamma \hfill \end{matrix}\right],\quad b\left(t\right)=\left[\begin{matrix}\hfill -2k\eta \mathcal{J}\left(t\right)\left({V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)-{C}_{xp}\left(t\right)\mathrm{sin}\left(\omega t\right)/m\omega \right)\hfill \\ \hfill -2k\eta \mathcal{J}\left(t\right)\left(m\omega {V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)+{C}_{xp}\left(t\right)\mathrm{cos}\left(\omega t\right)\right)\hfill \end{matrix}\right]\hfill \\ \hfill c\left(t\right)& =\left[\begin{matrix}\hfill 2\sqrt{k\eta }\left({V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)-{C}_{xp}\left(t\right)\mathrm{sin}\left(\omega t\right)/m\omega \right)\hfill \\ \hfill 2\sqrt{k\eta }\left(m\omega {V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)+{C}_{xp}\left(t\right)\mathrm{cos}\left(\omega t\right)\right)\hfill \end{matrix}\right].\hfill \end{align}$

The solution to this system can be formally written as

$\begin{equation}Z\left(t\right)={\text{e}}^{-\gamma t}\enspace Z\left(0\right)+{\int }_{0}^{t}\mathrm{d}\tau \enspace {\text{e}}^{-\gamma \left(t-\tau \right)}\enspace b\left(\tau \right)+{\int }_{0}^{t}\mathrm{d}W\left(\tau \right){\text{e}}^{-\gamma \left(t-\tau \right)}\enspace c\left(\tau \right).\end{equation} \tag{ 30 }$

Note that as before, the second order moments evolve slower than cos(ωt), sin(ωt). Furthermore, since $\mathcal{J}\left(t\right)$ is a smoothed measurement current, it also evolves slowly on the timescale of an oscillator period, T. Thus, we may neglect the second term since the integral over the rapidly oscillating sinusoidal terms will average to zero for t ≫ T. We cannot make the same argument for the third term, since dW(t) does not have finite variation over any interval. This third term is in fact what causes fluctuations of individual quadrature trajectories around their setpoint values in figure 5(b). However, note that since c(τ)e^{−γ(t−τ)} is a non-anticipating function (alternatively, an adapted process that depends only on current and prior times, and independent of the Wiener process), we may conclude that the third term vanishes when averaged over many trajectories, i.e., $\mathbb{E}\left\{{\int }_{0}^{t}\mathrm{d}W\left(\tau \right){\text{e}}^{-\gamma \left(t-\tau \right)}\enspace c\left(\tau \right)\right\}=0$ [76]. This leaves only the first, exponentially decaying term, for the average quadrature values and is therefore the reason for the exponential convergence of the ensemble average to the target values. This analysis also shows that the rate of convergence is slower for I feedback than for P feedback, for which there is an additional contribution of −2kηV_x(t) to the convergence rate, see equation (27).
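Both steps of this argument can be illustrated numerically. The Python sketch below (i) samples the stochastic integral in equation (30) many times and compares its ensemble mean with the expected statistical error, and (ii) compares a damped integral with and without a rapidly oscillating factor. It is a simplified illustration under stated assumptions: only the first component of c(τ) is used, C_xp = 0, V_x is held constant at 0.5, and the slowly varying factors in b(τ) are replaced by a constant; the parameter values follow figure 5.

```python
import math
import random

GAMMA, K, ETA, OMEGA, V_X = 0.02, 0.02, 0.4, 1.0, 0.5  # V_X frozen (assumption)


def c_x(tau):
    """First component of c(tau) with C_xp = 0 and constant V_x (assumption)."""
    return 2.0 * math.sqrt(K * ETA) * V_X * math.cos(OMEGA * tau)


def ito_sample(t_final, dt, rng):
    """One sample of the third term in equation (30), as an Ito sum."""
    total = 0.0
    for i in range(int(round(t_final / dt))):
        tau = i * dt                        # left endpoint (non-anticipating)
        dW = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment with variance dt
        total += math.exp(-GAMMA * (t_final - tau)) * c_x(tau) * dW
    return total


def ito_std(t_final, dt):
    """Exact standard deviation of the discrete sum used in ito_sample."""
    var = sum((math.exp(-GAMMA * (t_final - i * dt)) * c_x(i * dt)) ** 2 * dt
              for i in range(int(round(t_final / dt))))
    return math.sqrt(var)


def damped_integral(t_final, dt, oscillating):
    """Midpoint estimate of int_0^t exp(-gamma (t - tau)) f(tau) dtau."""
    total = 0.0
    for i in range(int(round(t_final / dt))):
        tau = (i + 0.5) * dt
        f = math.cos(OMEGA * tau) if oscillating else 1.0
        total += math.exp(-GAMMA * (t_final - tau)) * f * dt
    return total


rng = random.Random(0)
T_FINAL, DT, N_TRAJ = 20.0, 0.05, 1000
samples = [ito_sample(T_FINAL, DT, rng) for _ in range(N_TRAJ)]
mean = sum(samples) / N_TRAJ
std_err = ito_std(T_FINAL, DT) / math.sqrt(N_TRAJ)
print("ensemble mean:", mean, "standard error:", std_err)

osc = damped_integral(200.0, 0.01, oscillating=True)
flat = damped_integral(200.0, 0.01, oscillating=False)
print("oscillating:", osc, "non-oscillating:", flat)
```

Individual samples of the stochastic term are not small, but their mean is consistent with zero within the statistical error, while the oscillating integral stays of order 1/ω instead of growing towards 1/γ.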

This first analysis of control of state stabilization for the harmonic oscillator has shown that when access to both x and p control is given, the performance of purely proportional feedback with zero time delay is not improved by adding integral feedback. Indeed, both P and I feedback strategies converge exponentially to the target state when an ensemble average over I feedback trajectories is taken. This shows that state estimation [6] is not necessary to drive a harmonic oscillator to an arbitrary quantum state in the presence of thermal noise. However, when comparing the P and I strategies, it is evident that the P feedback is advantageous for two reasons. The first is that with zero time delay there exists a proportional feedback law that can perfectly cancel the measurement noise perturbations to the system for each individual trajectory, whereas this can only be approximately canceled under an integral feedback strategy for an individual trajectory, resulting in fluctuations about the target mean quadrature values for any given trajectory. The second is a faster convergence for P feedback. Given the superior performance of P feedback over I feedback in this setting, we conclude that it is not advantageous to consider a more general PI feedback protocol when P feedback with zero time delay is possible.

For time delays greater than the ideal τ_P = 0, the stabilization performance of the P feedback strategy degrades: individual trajectories fluctuate around the target quadrature expectation values, and these fluctuations have greater variance as the time delay is increased, although the ensemble average still converges to the target state (appendix F). For time delay values ⪆0.2T the I feedback strategy becomes preferable due to the larger deviations from the target values for the P feedback strategy.

4.2. x control only

Our second analysis of control of state stabilization for the harmonic oscillator considers the case of x measurement with only a single control, namely feedback control in x. Under x control only, we set α_p2 = α_i2 = 0, and therefore have a single feedback operator, F₁ = x.

4.2.1. Proportional control

As before, we first consider proportional control alone, i.e., α_i1 is also set to zero. Our feedback operator is x, and thus the feedback applies a force. Ideally we want this force to be proportional to −(⟨p⟩(t) − p_g(t)) in order to cancel the measurement noise. However, since we are measuring only the position, we do not have direct access to the momentum observable. This is manifest in the dynamical equations in equation (23) by the fact that the only deterministic term involving α_p1 is the term −α_p1(2⟨x⟩(t − τ_P) − g(t − τ_P))dt in the equation for d⟨p⟩(t). This term does not appear to be useful for controlling the oscillator momentum, because it contains information about ⟨x⟩ rather than ⟨p⟩. Indeed, we find that the trajectories for evolution of the mean values do not show convergent behavior when implementing proportional x feedback with τ_P = 0. Noting that for a harmonic oscillator the average position and momentum have a T/4 relative delay (see also [52]), in the weak measurement and damping limit (k, γ ≪ mω²) we can take a delayed signal term ⟨x⟩(t − T/4) to be a good approximation to the scaled oscillator momentum −⟨p⟩(t)/(mω). This allows formulation of a good control law based on delayed proportional feedback with τ_P = T/4. One can then follow the same line of reasoning outlined above in section 4.1 to tune the strength and offset of the feedback coefficient in order to achieve noise cancellation. Specifically, we set α_p1 = −2kηV_xmω with τ_P = T/4. We similarly add a term γp_g(t)x to H₀ in order to compensate for thermal damping. Note that full compensation of the effects of thermal damping requires adding a term γ(x_g(t)p + p_g(t)x); however, consistent with the assumption in this subsection that there is no direct control over the oscillator momentum, we add only the term γp_g(t)x. The resulting dynamical equations for the mean quadratures are given in appendix C.4.
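The quarter-period relation used here can be checked directly for the idealized case of a free oscillator, i.e., with measurement and damping switched off, in which case it holds exactly rather than approximately. The following sketch assumes this idealization; the units, initial values, and function names are hypothetical.

```python
import numpy as np

m, omega = 1.0, 2.0 * np.pi       # hypothetical units, chosen so that T = 1
T = 2.0 * np.pi / omega
x0, p0 = 0.7, -1.3                # arbitrary initial mean position and momentum

def mean_x(t):
    """Mean position of a free (unmeasured, undamped) harmonic oscillator."""
    return x0 * np.cos(omega * t) + p0 / (m * omega) * np.sin(omega * t)

def mean_p(t):
    """Mean momentum of the same free oscillator."""
    return -m * omega * x0 * np.sin(omega * t) + p0 * np.cos(omega * t)

t = np.linspace(0.0, 3.0 * T, 601)
delayed_x = mean_x(t - T / 4.0)           # position signal delayed by a quarter period
scaled_p = -mean_p(t) / (m * omega)       # scaled momentum it is meant to estimate
max_mismatch = np.max(np.abs(delayed_x - scaled_p))
print(f"max |x(t - T/4) + p(t)/(m omega)| = {max_mismatch:.2e}")
```

With weak measurement and damping (k, γ ≪ mω²) the mean values deviate only slightly from this free evolution over a quarter period, which is why the delayed position signal remains a good momentum estimate in that limit.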
When the mean-quadrature equations are transformed into the rotating frame, the evolution of the deviations becomes:

$\begin{align}\hfill \mathrm{d}\tilde {X}\left(t\right)\approx & 4k\eta {V}_{x}\left(t\right)\left[-m\omega \tilde {X}\left(t\right)\mathrm{sin}\left(\omega t\right)+\tilde {P}\left(t\right)\mathrm{cos}\left(\omega t\right)\right]\mathrm{sin}\left(\omega t\right)/m\omega \enspace \mathrm{d}t-\gamma \tilde {X}\left(t\right)\mathrm{d}t\hfill \\ \hfill & +\left[-\gamma {X}_{g}\enspace {\mathrm{cos}}^{2}\left(\omega t\right)-\gamma {P}_{g}\enspace \mathrm{sin}\left(\omega t\right)\mathrm{cos}\left(\omega t\right)/m\omega \right]\mathrm{d}t\hfill \\ \hfill & +2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{d}W\left(t\right)\mathrm{cos}\left(\omega t\right)-2\sqrt{k\eta }\left({C}_{xp}\left(t\right)\mathrm{d}W\left(t\right)-m\omega {V}_{x}\left(t\right)\mathrm{d}W\left(t-\frac{T}{4}\right)\right)\mathrm{sin}\left(\omega t\right)/m\omega \hfill \end{align} \tag{ 31a }$

$\begin{align}\hfill \qquad \approx & \left[-2k\eta {V}_{x}\left(t\right)\tilde {X}\left(t\right)-\gamma \tilde {X}\left(t\right)-\frac{\gamma }{2}{X}_{g}\right]\mathrm{d}t\hfill \\ \hfill & +\left(2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)-2\sqrt{k\eta }{C}_{xp}\left(t\right)\mathrm{sin}\left(\omega t\right)/m\omega \right)\mathrm{d}W\left(t\right)+2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)\mathrm{d}W\left(t-\frac{T}{4}\right)\hfill \end{align} \tag{ 31b }$

$\begin{align}\hfill \mathrm{d}\tilde {P}\left(t\right)\approx & 4k\eta {V}_{x}\left(t\right)\left[-\tilde {P}\left(t\right)\mathrm{cos}\left(\omega t\right)+m\omega \tilde {X}\left(t\right)\mathrm{sin}\left(\omega t\right)\right]\mathrm{cos}\left(\omega t\right)\mathrm{d}t-\gamma \tilde {P}\left(t\right)\mathrm{d}t\hfill \\ \hfill & +\left[-\gamma {P}_{g}\enspace {\mathrm{sin}}^{2}\left(\omega t\right)-\gamma m\omega {X}_{g}\enspace \mathrm{sin}\left(\omega t\right)\mathrm{cos}\left(\omega t\right)\right]\mathrm{d}t\hfill \\ \hfill & +2m\omega \sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{d}W\left(t\right)\mathrm{sin}\left(\omega t\right)+2\sqrt{k\eta }\left({C}_{xp}\left(t\right)\mathrm{d}W\left(t\right)-m\omega {V}_{x}\left(t\right)\mathrm{d}W\left(t-\frac{T}{4}\right)\right)\mathrm{cos}\left(\omega t\right)\hfill \end{align} \tag{ 31c }$

$\begin{align}\hfill \qquad \approx & \left[-2k\eta {V}_{x}\left(t\right)\tilde {P}\left(t\right)-\gamma \tilde {P}\left(t\right)-\frac{\gamma }{2}{P}_{g}\right]\mathrm{d}t\hfill \\ \hfill & +\left(2m\omega \sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)+2\sqrt{k\eta }{C}_{xp}\left(t\right)\mathrm{cos}\left(\omega t\right)\right)\mathrm{d}W\left(t\right)-2\sqrt{k\eta }m\omega {V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)\mathrm{d}W\left(t-\frac{T}{4}\right),\hfill \end{align} \tag{ 31d }$

where in the second line of each equation we have applied the period-averaging approximation to the deterministic terms, and regrouped the stochastic terms.
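The period-averaging step for the deterministic terms can be verified numerically by holding the slowly varying quantities fixed over one oscillator period and averaging the drift of the first form of equations (31a) and (31c). The sketch below does this; all parameter values are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
k, eta, Vx, gamma = 0.02, 0.6, 0.5, 0.01
m, omega = 1.0, 2.0 * np.pi
Xt, Pt = 0.3, -0.4       # rotating-frame deviations, frozen over one period
Xg, Pg = 1.0, 2.0        # target quadratures

T = 2.0 * np.pi / omega
t = np.linspace(0.0, T, 20000, endpoint=False)   # uniform grid over one full period
s, c = np.sin(omega * t), np.cos(omega * t)

# Deterministic (dt) part of the first form of equation (31a).
drift_X = (4 * k * eta * Vx * (-m * omega * Xt * s + Pt * c) * s / (m * omega)
           - gamma * Xt - gamma * Xg * c**2 - gamma * Pg * s * c / (m * omega))
# Deterministic (dt) part of the first form of equation (31c).
drift_P = (4 * k * eta * Vx * (-Pt * c + m * omega * Xt * s) * c
           - gamma * Pt - gamma * Pg * s**2 - gamma * m * omega * Xg * s * c)

avg_X, avg_P = drift_X.mean(), drift_P.mean()

# Period-averaged drifts as written in equations (31b) and (31d).
pred_X = -2 * k * eta * Vx * Xt - gamma * Xt - 0.5 * gamma * Xg
pred_P = -2 * k * eta * Vx * Pt - gamma * Pt - 0.5 * gamma * Pg
print(f"X drift: averaged {avg_X:+.6f}, predicted {pred_X:+.6f}")
print(f"P drift: averaged {avg_P:+.6f}, predicted {pred_P:+.6f}")
```

The agreement reflects the period averages of sin²(ωt) and cos²(ωt), which equal 1/2, and of sin(ωt)cos(ωt), which vanishes.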

The inability to actuate the oscillator momentum in this situation introduces two negative features into these equations relative to equation (26), for which both x and p control are available. The first is that we cannot perfectly cancel the measurement noise, resulting in the presence of stochastic terms. The second is that we cannot simply compensate for the thermal damping of oscillator momentum by adding a term γx_gp to H₀. This leads to the $-\frac{\gamma }{2}{X}_{g}\enspace \mathrm{d}t$ and $-\frac{\gamma }{2}{P}_{g}\enspace \mathrm{d}t$ terms in the period-averaged equations. The first point is not a serious hindrance to stabilization, because in the weak measurement limit the effect of the noise is small and leads primarily to fluctuations around the target values. However, the second point is more serious, since the inability to suppress thermal damping means that the system will be driven to a state that is different from the target state. In appendix E we show how the target state values X_g and P_g can simply be scaled to compensate for this incorrect steady state.

With this compensation trick solving the thermal damping issue for this constrained control setting, we can obtain very good stabilization behavior of individual trajectories to the desired target values, with relatively small fluctuations about these, as shown in figure 9(a) in appendix E. This figure shows a typical trajectory under this proportional control law, incorporating the above scaling of the target X_g and P_g values. It is important to note that we are simulating the dynamics here without any of the approximations used in the above analysis; i.e., using the time-delayed feedback current and without invoking the period-averaging approximation. Figure 9(a) shows that the time-delayed signal does indeed provide a good estimate of the oscillator momentum, evidently resulting in some but not complete suppression of measurement noise, as well as exponential convergence of the quadrature means to their goal values. Thus despite the reduced number of control degrees of freedom, one can nevertheless still achieve exponential convergence of the quantum expectations to their target values using P feedback, with zero bias from the target values and relatively small standard deviation (see figure 9(a)).

As in the case of x and p actuation, this P feedback strategy requires a precise value for the feedback loop time delay τ_P. Here the desired value of τ_P is non-zero, and is thus experimentally less demanding to realize than the ideal P feedback strategy with x and p actuation for which τ_P = 0 (section 4.1.1). However, it might still be challenging to engineer a feedback loop with a precise value of delay τ_P = T/4. To assess the robustness of the strategy with respect to uncertainties in τ_P, we also analyzed the stabilization performance of this P feedback strategy for larger time delays, i.e., τ_P = T/4 + ε. Results for several values of ε are shown in appendix F, where it is seen that in this case the stabilization performance degrades for all ε > 0. The fluctuations of individual trajectories of quadrature expectation increase with ε, and there is also a bias in the long-time values of these expectations; i.e., the ensemble average values $\mathbb{E}\langle X\left(t\right)\rangle$ and $\mathbb{E}\langle P\left(t\right)\rangle$ do not converge to the target values. This error in convergence is appreciable even for offsets as small as ε = 0.05T and increases with ε.

4.2.2. Integral control

We now study the case of integral feedback when only one feedback operator is available, again choosing F₁ = x. On setting α_p2 = α_i2 = α_p1 = 0 in equation (23), it is apparent that the only control handle into the system now comes from the $-{\alpha }_{\mathrm{i}\mathrm{1}}\mathcal{J}\left(t\right)$ term. As we learned above, the key to stabilizing the system with F₁ = x alone is to construct an estimator of the oscillator momentum. For P feedback we used a time delay to achieve this. Here we will construct an estimator with the integral filter.

Following Doherty et al [52], we first modulate the measurement signal to form estimates of the oscillator quadrature deviations in the rotating frame:

$\begin{align}\hfill {\mathcal{J}}_{\mathrm{X}}\left(t\right)& =\frac{1}{{\tau }_{\mathrm{I}}}{\int }_{t-{\tau }_{\mathrm{I}}}^{t}\left(j\left(s\right)-2{x}_{g}\left(s\right)\right)\mathrm{cos}\left(\omega s\right)\mathrm{d}s\approx \tilde {X}\left(t\right),\hfill \end{align} \tag{ 32a }$

$\begin{align}\hfill {\mathcal{J}}_{\mathrm{P}}\left(t\right)& =\frac{m\omega }{{\tau }_{\mathrm{I}}}{\int }_{t-{\tau }_{\mathrm{I}}}^{t}\left(j\left(s\right)-2{x}_{g}\left(s\right)\right)\mathrm{sin}\left(\omega s\right)\mathrm{d}s\approx \tilde {P}\left(t\right).\hfill \end{align} \tag{ 32b }$

Using equation (24) these integrals of the measurement record can be combined to yield an estimator of the error between ⟨p⟩(t) and p_g(t):

$\begin{equation}\mathcal{J}\left(t\right)=-m\omega {\mathcal{J}}_{\mathrm{X}}\left(t\right)\mathrm{sin}\left(\omega t\right)+{\mathcal{J}}_{\mathrm{P}}\left(t\right)\mathrm{cos}\left(\omega t\right).\end{equation} \tag{ 33 }$
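A noise-free numerical sketch of equations (32) and (33) may help to illustrate why the modulated integrals act as quadrature estimators. It assumes that the rotating-frame decomposition of equation (24) takes the form ⟨x⟩(s) − x_g(s) = X̃ cos(ωs) + P̃ sin(ωs)/(mω) and ⟨p⟩(s) − p_g(s) = −mωX̃ sin(ωs) + P̃ cos(ωs), that the deviations are constant over the integration window, and that τ_I is a whole number of periods; the parameter values and the function name are hypothetical.

```python
import numpy as np

m, omega = 1.0, 2.0 * np.pi
T = 2.0 * np.pi / omega
tau_I = 2.0 * T              # assumed: integration window of a whole number of periods
Xt, Pt = 0.3, -0.4           # true rotating-frame deviations, held constant

def estimators(t_end, n=40000):
    """Evaluate equations (32a), (32b) and (33) for a noise-free signal."""
    # Midpoint grid on the window [t_end - tau_I, t_end].
    s = t_end - tau_I + (np.arange(n) + 0.5) * (tau_I / n)
    # Noise-free signal j(s) - 2 x_g(s) = 2 (<x>(s) - x_g(s)).
    signal = 2.0 * (Xt * np.cos(omega * s) + Pt / (m * omega) * np.sin(omega * s))
    J_X = np.mean(signal * np.cos(omega * s))                # equation (32a)
    J_P = m * omega * np.mean(signal * np.sin(omega * s))    # equation (32b)
    J = (-m * omega * J_X * np.sin(omega * t_end)
         + J_P * np.cos(omega * t_end))                      # equation (33)
    return J_X, J_P, J

t_end = 5.3 * T
J_X, J_P, J = estimators(t_end)
# Momentum error <p>(t) - p_g(t) under the assumed decomposition.
p_error = -m * omega * Xt * np.sin(omega * t_end) + Pt * np.cos(omega * t_end)
print(f"J_X = {J_X:.6f} (X deviation {Xt}), J_P = {J_P:.6f} (P deviation {Pt})")
print(f"J = {J:.6f}, momentum error = {p_error:.6f}")
```

With measurement noise included, the same integrals return these values plus a smoothed noise contribution, which is the sense in which the integral filter provides a tempered estimate of the momentum error.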

We choose α_i1(t) = 4kηV_x(t) to achieve measurement noise cancellation and convergence to the target state. The resulting dynamical evolution of the quadrature means is given in appendix C.5, and the corresponding evolution of the deviations in the rotating frame is given by:

$\begin{align}\hfill \mathrm{d}\tilde {X}\left(t\right)& =\left[-\gamma \tilde {X}\left(t\right)-\gamma {X}_{g}\enspace {\mathrm{cos}}^{2}\left(\omega t\right)-\frac{\gamma {P}_{g}}{m\omega }\enspace \mathrm{sin}\left(\omega t\right)\mathrm{cos}\left(\omega t\right)+\frac{4k\eta {V}_{x}\left(t\right)}{m\omega }\mathcal{J}\left(t\right)\mathrm{sin}\left(\omega t\right)\right]\mathrm{d}t\hfill \\ \hfill & \quad +2\sqrt{k\eta }\left({V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)-{C}_{xp}\left(t\right)\mathrm{sin}\left(\omega t\right)/m\omega \right)\mathrm{d}W\left(t\right)\hfill \\ \hfill & \approx \left[-2k\eta {V}_{x}\left(t\right)\tilde {X}\left(t\right)-\gamma \tilde {X}\left(t\right)-\frac{\gamma }{2}{X}_{g}\right]\mathrm{d}t+2\sqrt{k\eta }\left({V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)-{C}_{xp}\left(t\right)\mathrm{sin}\left(\omega t\right)/m\omega \right)\mathrm{d}W\left(t\right)\hfill \end{align} \tag{ 34a }$

$\begin{align}\hfill \mathrm{d}\tilde {P}\left(t\right)& =\left[-\gamma \tilde {P}\left(t\right)-\gamma {P}_{g}\enspace {\mathrm{sin}}^{2}\left(\omega t\right)-\gamma m\omega {X}_{g}\enspace \mathrm{sin}\left(\omega t\right)\mathrm{cos}\left(\omega t\right)-4k\eta {V}_{x}\left(t\right)\mathcal{J}\left(t\right)\mathrm{cos}\left(\omega t\right)\right]\mathrm{d}t\hfill \\ \hfill & \quad +2\sqrt{k\eta }\left(m\omega {V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)+{C}_{xp}\left(t\right)\mathrm{cos}\left(\omega t\right)\right)\mathrm{d}W\left(t\right)\hfill \\ \hfill & \approx \left[-2k\eta {V}_{x}\left(t\right)\tilde {P}\left(t\right)-\gamma \tilde {P}\left(t\right)-\frac{\gamma }{2}{P}_{g}\right]\mathrm{d}t+2\sqrt{k\eta }\left(m\omega {V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)+{C}_{xp}\left(t\right)\mathrm{cos}\left(\omega t\right)\right)\mathrm{d}W\left(t\right),\hfill \end{align} \tag{ 34b }$

where in the last line of each equation we have used the period-averaging approximation and the approximations ${\mathcal{J}}_{\mathrm{X}}\left(t\right)\approx \tilde {X}\left(t\right)$ and ${\mathcal{J}}_{\mathrm{P}}\left(t\right)\approx \tilde {P}\left(t\right)$.

These equations have the same form as equation (31), including exactly the same deterministic terms. Therefore, as shown in appendix E, the ensemble average steady state for this evolution will not be the target values (X_g, P_g). However, as in that case, we can compensate for these scale factors by adjusting the target values. Once this compensation is made, the system converges exponentially towards the target values with fluctuations. This is evidenced in the simulations shown in figure 9(b), which show relatively small fluctuations similar to those for P feedback, with standard deviation 0.1676 ± 0.002 about the target values at long times.
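The origin of the offset can be illustrated with the ensemble-averaged, period-averaged form of equation (34a), in which the stochastic term drops out. The sketch below additionally assumes that V_x has reached a constant value, so that the mean deviation obeys a linear equation with rate 2kηV_x + γ and a constant drive −γX_g/2. Parameter values are hypothetical, and the rescaling of appendix E that removes this offset is not reproduced here.

```python
# Hypothetical parameter values, chosen only for illustration.
k, eta, Vx, gamma = 0.02, 0.6, 0.5, 0.01
Xg = 1.0                                  # target quadrature value
rate = 2 * k * eta * Vx + gamma           # convergence rate of the mean deviation

# Forward-Euler integration of d<X~>/dt = -rate <X~> - (gamma / 2) Xg.
dt, n_steps = 0.05, 20000
X = 0.8                                   # arbitrary initial mean deviation
for _ in range(n_steps):
    X += (-rate * X - 0.5 * gamma * Xg) * dt

# Fixed point of the same equation.
X_ss = -0.5 * gamma * Xg / rate
print(f"long-time mean deviation = {X:.6f}, fixed point = {X_ss:.6f}")
```

The nonzero fixed point is the bias of the uncompensated ensemble average away from the target value, and it vanishes only when γ = 0.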

Both the P and I feedback trajectories shown in figure 9 show stochastic noise. Since the feedback in the integral strategy is conditioned on a tempered version of the noise instead of on the instantaneous noise, we can expect that this smoothing of the noise should give the integral strategy a relative advantage over the purely proportional strategy here. While the noise does appear smaller in the I trajectory (compare figure 9(b) with 9(a)), it is difficult to ascertain the effect of this on the overall performance of the control strategy by examining single trajectories. To enable a quantitative comparison between the performance of the two control strategies in this situation, we therefore define the following average error metric that quantifies the deviation from the control goals when averaged over all measurement trajectories:

$\begin{equation}{\Delta}\left(t\right)=\sqrt{\frac{1}{2}\mathbb{E}\left[m\omega \tilde {X}{\left(t\right)}^{2}+\frac{\tilde {P}{\left(t\right)}^{2}}{m\omega }\right]}.\end{equation} \tag{ 35 }$

We estimate this error by simulating a large ensemble of trajectories with P or I feedback control. In figure 6 we plot the long-time value of this average error, i.e., when it reaches a constant value, as a function of the measurement efficiency, η. This plot shows that I feedback consistently gives a smaller error and thus performs better than P feedback over essentially the full range of measurement efficiency η.
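Given an ensemble of simulated rotating-frame deviations, the error metric of equation (35) can be estimated by replacing the expectation with a sample mean over trajectories. The following is a minimal sketch; the array layout, function name, and the small example ensemble are assumptions made for illustration and do not correspond to the simulations reported in figure 6.

```python
import numpy as np

def average_error(X_dev, P_dev, m=1.0, omega=1.0):
    """Sample estimate of equation (35) at each stored time.

    X_dev, P_dev: arrays of shape (n_traj, n_times) holding the
    rotating-frame deviations of each trajectory (assumed layout).
    """
    X_dev = np.asarray(X_dev, dtype=float)
    P_dev = np.asarray(P_dev, dtype=float)
    weighted = m * omega * X_dev**2 + P_dev**2 / (m * omega)
    return np.sqrt(0.5 * weighted.mean(axis=0))   # average over trajectories

# Tiny hand-made ensemble: two trajectories, one time point.
X_dev = [[1.0], [-1.0]]
P_dev = [[2.0], [-2.0]]
delta = average_error(X_dev, P_dev)
print(delta)
```

For this example the sample means of X̃² and P̃² are 1 and 4, so with m = ω = 1 the estimate equals the square root of 2.5.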

In summary, when we only have access to the F₁ = x control operator, we do not have sufficient control degrees of freedom to follow the strategy of both cancelling the noise and engineering convergence to the target values, as was possible for P feedback in section 4.1. However, we have seen that by forming momentum estimators (via use of time delay in the P feedback case, and via integral approximations of the quadratures in the I feedback case), we can still achieve effective control, with exponential convergence as before. We find that with this approach, both P and I feedback achieve similar control accuracy, with I feedback performing slightly better on average and the difference increasing with greater measurement efficiency η. Moreover, both of these P and I feedback strategies show the same rate of convergence to the target quadrature mean values, as is evident from the fact that (within the period-averaging approximation) equations (31) and (34) have the same deterministic terms. However, neither of these strategies guarantees convergence of individual trajectories. We also note that the P feedback strategy is very sensitive to the exact value of the time delay, for which the ideal value is τ_P = T/4. Deviations from this ideal value result in inadequate stabilization performance, with failure to reach the target state even on average.

Given the similar performance of P and I feedback in this scenario and the lack of robustness of P feedback to variations in the time delay, one might not expect a significant benefit to combining the two to construct a PI feedback strategy. To assess this, we write ${\tilde {\alpha }}_{\mathrm{p}\mathrm{1}}\left(t\right)=\left(1-\theta \right){\alpha }_{\mathrm{p}\mathrm{1}}\left(t\right)$ and ${\tilde {\alpha }}_{\mathrm{i}\mathrm{1}}\left(t\right)=\theta {\alpha }_{\mathrm{i}\mathrm{1}}\left(t\right)$, where α_p1(t) and α_i1(t) are the values determined above, and θ ∈ [0, 1] is a mixing ratio quantifying the combination of the two strategies. In figure 6 we plot the long-time control error for θ = 0.8 (the long-time control error is minimal, and almost the same, for any value of θ in the interval [0.8, 1]) and note that, indeed, there is little statistically significant benefit to combining P and I feedback in this scenario.
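The mixing of the two strategies amounts to a convex interpolation between the pure P gain α_p1 = −2kηV_xmω and the pure I gain α_i1 = 4kηV_x chosen above. A minimal sketch of this bookkeeping follows; the function name and the numerical parameter values are hypothetical.

```python
def mixed_gains(theta, alpha_p1, alpha_i1):
    """Return the mixed gains ((1 - theta) * alpha_p1, theta * alpha_i1)."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return (1.0 - theta) * alpha_p1, theta * alpha_i1

# Hypothetical parameter values, chosen only for illustration.
k, eta, Vx, m, omega = 0.02, 0.6, 0.5, 1.0, 1.0
alpha_p1 = -2 * k * eta * Vx * m * omega   # pure P gain from section 4.2.1
alpha_i1 = 4 * k * eta * Vx                # pure I gain from section 4.2.2

for theta in (0.0, 0.8, 1.0):
    print(theta, mixed_gains(theta, alpha_p1, alpha_i1))
```

Setting θ = 0 recovers pure P feedback and θ = 1 recovers pure I feedback, with θ = 0.8 corresponding to the mixed case plotted in figure 6.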

5. Discussion and conclusions

We have presented and implemented a formalism for modeling proportional and integral (PI) feedback control in quantum systems for which, as in the case of classical PI feedback control, we allow the feedback to be tuned from a purely proportional feedback strategy (P feedback, including the possibility of delay) to a purely integral feedback (I feedback), with a combined strategy at any point in between (PI feedback). In this approach both proportional and integral feedback components are defined in terms of the measurement outcomes only, i.e., no dependence on knowledge of the quantum state is assumed. Consequently we did not seek globally optimal protocols, rather the best performance within the options of P, I, and PI feedback, given the ability to feedback quantum operations based only on the measurement record. For a given implementation we then first compared the performance of separate P feedback and I feedback control strategies, with and without the presence of time delay in the former, and then carried out a PI feedback strategy, following an assessment of whether or not this might be beneficial.

We implemented this quantum PI feedback approach in this work for two canonical quantum control problems, namely entanglement generation of remote qubits by non-local measurements with local feedback operations, and stabilization of a harmonic oscillator to arbitrary target values of its quadrature expectations when subject to thermal noise.

Our first case study was the generation of entanglement by measurement of collective operators of two non-interacting qubits, combined with local feedback operations, for arbitrary measurement efficiency η ⩽ 1 and time-independent proportionality constant α_p. Unlike the situation for η = 1 and more general time-dependent P feedback [39], our more restricted—but experimentally relevant—case is unable to completely cancel measurement noise, regardless of the value of η. Here we found that an I feedback strategy can improve on P feedback and achieve superior performance, essentially because an I strategy is able to formulate a smoothed estimate of the error signal by means of the integral filter. This situation is reminiscent of PI feedback control in classical systems [55] and this case provides strong motivation for the formulation of a general PI feedback law that combines the P and I feedback strategies. We numerically determined an optimal mixing ratio between P and I feedback for this problem of remote entanglement generation, showed that this optimal value can depend on the overall feedback strength and system Hamiltonian, and demonstrated that PI feedback can be beneficial over both the I and P feedback strategies in some cases.

In the case of the harmonic oscillator, as in previous work on cooling of a harmonic oscillator [6], we studied two settings of feedback control based on measurement of the position degree of freedom x, which is generally easier to measure than the momentum p. In the first setting, it is possible to actuate both x and p degrees of freedom of the oscillator, while in the second regime it is possible to only actuate x, i.e., to apply a force. The first setting allows formulation of a P feedback strategy that can perfectly cancel the measurement backaction noise entering the system, resulting in a deterministic evolution of the average state [6]. In this setting, adding a Hamiltonian drive to compensate for thermal damping results in a P feedback strategy that allows any state to be exponentially driven to the target quadrature expectation values, without any measurement induced fluctuations. In contrast, while an I feedback strategy that is exponentially convergent can also be formulated, integral feedback terms are regular and cannot completely cancel the measurement noise. This results in a somewhat slower rate of stabilization and considerable fluctuations in the quadrature expectations for individual trajectories, implying that a PI feedback strategy is not as effective as a P feedback strategy with zero time delay. However the ensemble average does converge exponentially to the target quadratures, indicating zero bias of the ensemble in the long-time quadrature expectations.

In the second harmonic oscillator setting, with control only over the x degree of freedom, complete cancellation of the measurement noise can no longer be made, even in a P feedback strategy. However, by using a time delay in P feedback and integral filters in I feedback to obtain estimates of the time-dependent oscillator momentum, we found that it is nevertheless still possible to formulate good feedback control laws that achieve exponential convergence of quadrature expectation values on average, with relatively small measurement noise induced fluctuations of individual trajectories around their target values. In this case, we consistently found a small advantage of I feedback over P feedback for all efficiencies η, with the former also showing smaller fluctuations around the goal. This was seen to stem from the fact that I feedback can derive a smoother estimate of the oscillator momentum through use of an integral filter, and thus allows us to engineer a system with more controlled and smaller fluctuations around the target quadrature mean values.

Thus for the harmonic oscillator state stabilization, we find the best performance with a pure P strategy when both x and p controls are available, and the best performance with a pure I feedback strategy when only x control is available. We found little significant advantage in formulating a general mixed PI feedback strategy for the harmonic oscillator state stabilization. Although we make no claims about the optimality of any of these feedback control strategies for the harmonic oscillator, a significant feature of our analysis is the proof that all of them lead to exponential convergence of the expectation values of the oscillator quadratures to their goal values. We emphasize that this convergence analysis has been restricted to the parameter regime where a period-averaging (i.e., rotating wave) approximation is valid. It is possible that this landscape of PI feedback performance could change outside this regime, which is a potential topic for further study.

We also examined the robustness of the P feedback strategies to imperfect time delay, investigating the effects of larger values of τ_P than specified by the ideal control law. We found that the harmonic oscillator state stabilization example when both x and p actuation are available is the most robust to finite time delays, with the quadrature expectations at long time having zero bias from their target values (i.e., $\mathbb{E}\langle X\rangle \left(t\to \infty \right)={X}_{g}$ and $\mathbb{E}\langle P\rangle \left(t\to \infty \right)={P}_{g}$ ), but with fluctuations from the target values that increase with time delay. Meanwhile, both the harmonic oscillator with only x actuation and the two qubit remote entanglement example are very sensitive to deviations of τ_P away from the ideal specified value, with performance degrading rapidly as the deviation increases. For the latter cases, the I feedback strategy will therefore be preferred when the perfect time delay condition cannot be met.

These case studies reveal a key difference between the benefits of PI feedback in the quantum and classical domains. In the quantum case, there is an unavoidable correlation between the noise experienced by the system and the noise in the measurement signal. This is not always the case in classical systems, where the 'process noise' that the system experiences is often independent of the measurement noise. This difference means that P feedback strategies can play a unique and potentially more powerful role in the quantum domain than they typically do in the classical domain. In particular, in some circumstances, depending on the feedback actuation degrees of freedom, a P feedback strategy can perfectly cancel the noise that the system experiences, while an I feedback strategy can only approximately cancel this noise. Of course, one can get the same behavior in special classical systems, e.g., linear systems with zero process noise. In cases where this perfect cancellation is not possible, whether this is due to time delay or other constraints on the feedback action, we saw that I feedback can outperform P feedback, because it provides a smoothed version of the measurement/process noise. This beneficial value of I feedback is similar to that seen in classical PI feedback control.

Several possibilities for extending this work are immediately evident. Firstly, formulating optimal forms of PI feedback in the quantum domain would be beneficial, even for paradigmatic systems that are analytically tractable like the harmonic oscillator example treated here. The results in the current work indicate that such optimality studies would be particularly useful for feedback control in situations with inefficient measurements (see e.g., [54]). Secondly, the development of heuristic methods for tuning the optimal proportions of P and I feedback for any system, analogous to those that exist for classical PI feedback control [55], is an interesting direction. Here, it would be of interest to determine the optimal strategies under constraints of finite measurement and feedback bandwidth, in contrast to the infinite bandwidth controls implicitly assumed in this work, but still without state estimation. Exploration of robust methods to address the implementation of differential control terms to allow implementation of quantum PID control would also be valuable. Finally, our demonstration of the beneficial effects of integral control strategies for generation of entangled states of qubits under inefficient measurements within the range of current capabilities [64] indicates good prospects for experimental demonstration of quantum PI feedback in the near future.

Acknowledgments

HC acknowledges the China Scholarship Council (Grant 201706230189) for support of an exchange studentship at the University of California, Berkeley. HL, FM, and KBW were supported in part by the DARPA QUEST program. LM was supported by the National Science Foundation Graduate Fellowship Grant No. 1106400 and the Berkeley Fellowship for Graduate Study. MS was supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under the Quantum Computing Application Teams. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States Government.

Appendix A. Two-qubit entanglement generation

The SME that describes the evolution of the two-qubit system for τ_P = 0 is

$\begin{align}\hfill \mathrm{d}\rho \left(t\right)=& \left\{-\mathrm{i}\left[H,\rho \right]+k\mathcal{D}\left[{L}_{z}\right]\rho -\mathrm{i}{\alpha }_{\mathrm{p}}\left[{L}_{x},{L}_{z}\rho +\rho {L}_{z}\right]+\frac{{\alpha }_{\mathrm{p}}^{2}}{k\eta }\mathcal{D}\left[{L}_{x}\right]\rho -\mathrm{i}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\left[{L}_{x},\rho \right]\right\}\mathrm{d}t\hfill \\ \hfill & +\mathrm{d}W\mathcal{H}\left[\sqrt{\eta k}{L}_{z}-\frac{\mathrm{i}{\alpha }_{\mathrm{p}}}{\sqrt{\eta k}}{L}_{x}\right]\rho ,\hfill \end{align} \tag{ A1 }$

and for τ_P > 0

$\begin{align}\hfill \mathrm{d}\rho \left(t\right)=& \left\{-\mathrm{i}\left[H,\rho \right]+k\mathcal{D}\left[{L}_{z}\right]\rho -\mathrm{i}{\alpha }_{\mathrm{p}}j\left(t-{\tau }_{\mathrm{P}}\right)\left[{L}_{x},\rho \right]+\frac{{\alpha }_{\mathrm{p}}^{2}}{k\eta }\mathcal{D}\left[{L}_{x}\right]\rho -\mathrm{i}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\left[{L}_{x},\rho \right]\right\}\mathrm{d}t\hfill \\ \hfill & +\sqrt{\eta k}\enspace \mathrm{d}W\mathcal{H}\left[{L}_{z}\right]\rho ,\hfill \end{align} \tag{ A2 }$

where α_p and α_i are the proportional and integral feedback coefficients, and as mentioned in the main text above, we have set the goal g(t) = ⟨T₀|L_z|T₀⟩ = 0.

In this appendix we write in full the non-linear stochastic equation of motion of the triplet state populations and coherences for the two-qubit example treated in the main text. We keep things general and do not assume h₁ = h₂ in this appendix.

The measurement current is

$\begin{equation}I\left(t\right)=2\langle {L}_{z}\rangle \left(t\right)+\xi \left(t\right)/\sqrt{k\eta }.\end{equation} \tag{ A3 }$
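In a time-discretized simulation, the white-noise term ξ(t) of equation (A3) is commonly represented by dW/dt, with dW a Gaussian increment of variance dt, so that the current averaged over one time bin has variance 1/(kη dt). The sketch below generates such a record for a prescribed ⟨L_z⟩; this discretization, the function name, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def measurement_record(Lz_mean, k, eta, dt, rng):
    """Bin-averaged measurement current following equation (A3).

    Lz_mean: array of <L_z> values, one per time bin of width dt.
    The white noise xi(t) is represented by dW / dt with Var(dW) = dt.
    """
    Lz_mean = np.asarray(Lz_mean, dtype=float)
    dW = rng.normal(0.0, np.sqrt(dt), size=Lz_mean.shape)
    return 2.0 * Lz_mean + dW / (dt * np.sqrt(k * eta))

rng = np.random.default_rng(1)
k, eta, dt = 1.0, 0.4, 0.01                     # hypothetical values
record = measurement_record(np.full(200000, 0.25), k, eta, dt, rng)
print(f"mean current = {record.mean():.3f}")
print(f"current variance = {record.var():.1f}, 1/(k eta dt) = {1 / (k * eta * dt):.1f}")
```

The mean of the record approaches 2⟨L_z⟩, while the large bin variance shows why the raw current is a noisy error signal at low measurement efficiency.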

We denote the populations of the triplet and singlet two-qubit states as T_± = tr(ρ|T_±⟩⟨T_±|), T₀ = tr(ρ|T₀⟩⟨T₀|), and T_S = tr(ρ|S⟩⟨S|). If the initial state is in the triplet subspace, the subsequent evolution will stay within this subspace under the action of the half-parity measurement and local feedback operations ∝σ_x1 + σ_x2.

The evolution of the triplet state populations is given by

$\begin{equation}\begin{aligned}\hfill \mathrm{d}{T}_{-1}=& \left[-\sqrt{2}{\alpha }_{\mathrm{p}}\enspace \text{Im}\enspace {T}_{0,-1}+\frac{{\alpha }_{\mathrm{p}}^{2}}{2k\eta }\left({T}_{0}-{T}_{-1}-\text{Re}\enspace {T}_{-1,1}\right)+\sqrt{2}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\text{Im}\enspace {T}_{0,-1}\right]\mathrm{d}t\hfill \\ \hfill & -\left[2\sqrt{\eta k}{T}_{-1}\left(1+\langle {L}_{z}\rangle \left(t\right)\right)-\frac{\sqrt{2}{\alpha }_{p}}{\sqrt{\eta k}}\enspace \text{Im}\enspace {T}_{0,-1}\right]\mathrm{d}W\left(t\right)\hfill \end{aligned}\end{equation} \tag{ A4 }$

$\begin{equation}\begin{aligned}\hfill \mathrm{d}{T}_{0}=& \left[2\left({h}_{1}-{h}_{2}\right)\text{Im}\enspace {T}_{0,S}+\sqrt{2}{\alpha }_{\mathrm{p}}\left(\text{Im}\enspace {T}_{0,-1}-\text{Im}\enspace {T}_{0,1}\right)+\frac{{\alpha }_{\mathrm{p}}^{2}}{2k\eta }\left({T}_{-1}+{T}_{1}-2{T}_{0}+2\enspace \text{Re}\enspace {T}_{-1,1}\right)\right.\hfill \\ \hfill & \left.-\sqrt{2}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\left(\text{Im}\enspace {T}_{0,1}+\text{Im}\enspace {T}_{0,-1}\right)\right]\mathrm{d}t-\left[2\sqrt{\eta k}{T}_{0}\langle {L}_{z}\rangle \left(t\right)+\frac{\sqrt{2}{\alpha }_{\mathrm{p}}}{\sqrt{\eta k}}\left(\text{Im}\enspace {T}_{0,1}+\text{Im}\enspace {T}_{0,-1}\right)\right]\mathrm{d}W\left(t\right)\hfill \end{aligned}\end{equation} \tag{ A5 }$

$\begin{equation}\begin{aligned}\hfill \mathrm{d}{T}_{1}=& \left[\sqrt{2}{\alpha }_{\mathrm{p}}\enspace \text{Im}\enspace {T}_{0,1}+\frac{{\alpha }_{\mathrm{p}}^{2}}{2k\eta }\left({T}_{0}-{T}_{1}-\text{Re}\enspace {T}_{1,-1}\right)+\sqrt{2}{\alpha }_{\mathrm{i}}\mathcal{J}\left(t\right)\text{Im}\enspace {T}_{0,1}\right]\mathrm{d}t\hfill \\ \hfill & +\left[2\sqrt{\eta k}{T}_{1}\left(1-\langle {L}_{z}\rangle \left(t\right)\right)+\frac{\sqrt{2}{\alpha }_{\mathrm{p}}}{\sqrt{\eta k}}\enspace \text{Im}\enspace {T}_{0,1}\right]\mathrm{d}W\left(t\right).\hfill \end{aligned}\end{equation} \tag{ A6 }$

The corresponding coherence terms within the triplet subspace are

$\begin{equation}\begin{aligned}\hfill \mathrm{d}{T}_{1,-1}=& \left\{2\left[\mathrm{i}\left({h}_{1}+{h}_{2}\right)-k\right]{T}_{1,-1}+\frac{\mathrm{i}{\alpha }_{\mathrm{p}}}{\sqrt{2}}\left({T}_{0,-1}+{T}_{1,0}\right)+\frac{{\alpha }_{\mathrm{p}}^{2}}{2k\eta }\left[{T}_{0}-{T}_{1,-1}-\frac{1}{2}\left({T}_{1}+{T}_{-1}\right)\right]\right.\hfill \\ \hfill & \left.-\frac{\mathrm{i}{\alpha }_{\mathrm{i}}}{\sqrt{2}}\mathcal{J}\left(t\right)\left({T}_{0,-1}-{T}_{1,0}\right)\right\}\mathrm{d}t+\left[\frac{\mathrm{i}{\alpha }_{\mathrm{p}}}{\sqrt{2\eta k}}\left({T}_{1,0}+{T}_{0,-1}\right)-2\sqrt{\eta k}\langle {L}_{z}\rangle \left(t\right){T}_{1,-1}\right]\mathrm{d}W\left(t\right)\hfill \end{aligned}\end{equation} \tag{ A7 }$

$\begin{equation}\begin{aligned}\hfill \mathrm{d}{T}_{0,1}=& \left\{-\left[\mathrm{i}\left({h}_{1}+{h}_{2}\right)+\frac{1}{2}k\right]{T}_{0,1}+\mathrm{i}\left({h}_{1}-{h}_{2}\right){T}_{S,1}-\mathrm{i}\sqrt{2}{\alpha }_{\mathrm{p}}{T}_{1}+\frac{{\alpha }_{\mathrm{p}}^{2}}{2k\eta }\left({T}_{-1,0}+{T}_{1,0}-\frac{1}{2}{T}_{0,-1}-\frac{3}{2}{T}_{0,1}\right)\right.\hfill \\ \hfill & \left.-\frac{\mathrm{i}{\alpha }_{\mathrm{i}}}{\sqrt{2}}\mathcal{J}\left(t\right)\left({T}_{1}-{T}_{0}+{T}_{1,-1}\right)\right\}\mathrm{d}t+\left[\frac{\mathrm{i}{\alpha }_{\mathrm{p}}}{\sqrt{2\eta k}}\left({T}_{0}-{T}_{1}-{T}_{-1,1}\right)+\sqrt{\eta k}\left(1-2\langle {L}_{z}\rangle \left(t\right)\right){T}_{0,1}\right]\mathrm{d}W\left(t\right)\hfill \end{aligned}\end{equation} \tag{ A8 }$

$\begin{equation}\begin{aligned}\hfill \mathrm{d}{T}_{0,-1}=& \left\{\left[\mathrm{i}\left({h}_{1}+{h}_{2}\right)-\frac{1}{2}k\right]{T}_{0,-1}+\left({h}_{1}-{h}_{2}\right){T}_{S,-1}+\mathrm{i}\sqrt{2}{\alpha }_{\mathrm{p}}{T}_{-1}+\frac{{\alpha }_{\mathrm{p}}^{2}}{2k\eta }\left({T}_{1,0}+{T}_{-1,0}-\frac{3}{2}{T}_{0,-1}-\frac{1}{2}{T}_{0,1}\right)\right.\hfill \\ \hfill & \left.-\frac{\mathrm{i}{\alpha }_{\mathrm{i}}}{\sqrt{2}}\mathcal{J}\left(t\right)\left({T}_{1,-1}+{T}_{-1}- {T}_{0}\right)\right\}\mathrm{d}t+\left[\frac{\mathrm{i}{\alpha }_{\mathrm{p}}}{\sqrt{2\eta k}}\left({T}_{0}-{T}_{-1}-{T}_{1,-1}\right)- \sqrt{\eta k}\left(1+ 2\langle {L}_{z}\rangle \left(t\right)\right){T}_{0,-1}\right]\mathrm{d}W\left(t\right).\hfill \end{aligned}\end{equation} \tag{ A9 }$

The T_S,i are coherences with the singlet state, which come into play only if h₁ ≠ h₂.

This system of nonlinear SDEs cannot be solved directly, except in the special case of P feedback with zero time delay, i.e., α_p > 0, α_i = 0, τ_P = 0. In this case the SDEs are Markovian, and the evolution equations for the ensemble average follow by simply discarding the stochastic terms [77]. We can then solve for the steady state of $\mathbb{E}{T}_{0}\left(t\right)$ to get

$\begin{equation}\mathbb{E}{T}_{0}\left(t\to \infty \right)=\frac{4\eta {\left({h}_{1}+{h}_{2}\right)}^{2}+{k}^{2}\eta +8{\eta }^{2}{k}^{2}+{\alpha }_{\mathrm{p}}^{2}}{12{\left({h}_{1}+{h}_{2}\right)}^{2}+3{k}^{2}\eta +8{\eta }^{2}{k}^{2}+3{\alpha }_{\mathrm{p}}^{2}}.\end{equation} \tag{ A10 }$

This expression shows that the steady state average population in the desired state increases with decreasing α_p. However, from simulations we also see that the system takes longer to converge to the steady state as α_p decreases. Finally, we note that $\mathbb{E}\left[{T}_{0}\left(t\to \infty \right)\right]{< }1$ always.
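As a quick numerical check, the following Python sketch evaluates the right-hand side of equation (A10) exactly as printed. The function name and the parameter values in the loop are illustrative assumptions and are not taken from the main text.

```python
def steady_state_T0(h1, h2, k, eta, alpha_p):
    """Right-hand side of equation (A10), evaluated as printed."""
    hs2 = (h1 + h2) ** 2
    numerator = 4 * eta * hs2 + k**2 * eta + 8 * eta**2 * k**2 + alpha_p**2
    denominator = 12 * hs2 + 3 * k**2 * eta + 8 * eta**2 * k**2 + 3 * alpha_p**2
    return numerator / denominator


# Hypothetical parameter values, chosen only to exercise the formula.
for alpha_p in (0.1, 0.5, 1.0, 2.0):
    value = steady_state_T0(h1=0.3, h2=0.2, k=1.0, eta=0.4, alpha_p=alpha_p)
    print(f"alpha_p = {alpha_p:4.1f}  ->  E[T0](t -> inf) = {value:.4f}")
```

The bound on the steady-state population can be read off directly: the denominator minus the numerator equals (12 − 4η)(h₁ + h₂)² + 2k²η + 2α_p², which is strictly positive for 0 < η ≤ 1 and k > 0, so the ratio stays below 1.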

Appendix B.: SME of harmonic oscillator

For τ_P > 0, the evolution of the system with PI feedback control is obtained from equation (12) as

$\begin{align}\hfill \mathrm{d}\rho \left(t\right)=& \left\{-\mathrm{i}\left[{H}_{0},\rho \left(t\right)\right]+k\mathcal{D}\left[x\right]\rho \left(t\right)+2\gamma \left(N+1\right)\mathcal{D}\left[a\right]\rho +2\gamma N\mathcal{D}\left[{a}^{{\dagger}}\right]\rho +\frac{{\alpha }_{\mathrm{p}\mathrm{1}}^{2}}{k\eta }\mathcal{D}\left[x\right]\rho \left(t\right)+\frac{{\alpha }_{\mathrm{p}\mathrm{2}}^{2}}{k\eta }\mathcal{D}\left[p\right]\rho \left(t\right)\right.\hfill \\ \hfill & \left.-\mathrm{i}\left({\alpha }_{\mathrm{i}\mathrm{1}}\mathcal{J}\left(t\right)+{\alpha }_{\mathrm{p}\mathrm{1}}e\left(t-{\tau }_{\mathrm{P}}\right)\right)\left[x,\rho \right]-\mathrm{i}\left({\alpha }_{\mathrm{i}\mathrm{2}}\mathcal{J}\left(t\right)+{\alpha }_{\mathrm{p}\mathrm{2}}e\left(t-{\tau }_{\mathrm{P}}\right)\right)\left[p,\rho \right]\right\}\mathrm{d}t\hfill \\ \hfill & +\sqrt{k\eta }\mathcal{H}\left[x\right]\rho \left(t\right)\mathrm{d}W\left(t\right).\hfill \end{align} \tag{ B1 }$

For τ_P = 0, the evolution of the system with PI feedback control is obtained from equation (13) as

$\begin{align}\hfill \mathrm{d}\rho \left(t\right)=& \left\{-\mathrm{i}\left[{H}_{0},\rho \left(t\right)\right]+k\mathcal{D}\left[x\right]\rho \left(t\right)+2\gamma \left(N+1\right)\mathcal{D}\left[a\right]\rho +2\gamma N\mathcal{D}\left[{a}^{{\dagger}}\right]\rho +\frac{{\alpha }_{\mathrm{p}\mathrm{1}}^{2}}{k\eta }\mathcal{D}\left[x\right]\rho \left(t\right)+\frac{{\alpha }_{\mathrm{p}\mathrm{2}}^{2}}{k\eta }\mathcal{D}\left[p\right]\rho \left(t\right)\right.\hfill \\ \hfill & \left.-\mathrm{i}\left({\alpha }_{\mathrm{i}\mathrm{1}}\mathcal{J}\left(t\right)-{\alpha }_{\mathrm{p}\mathrm{1}}g\left(t\right)\right)\left[x,\rho \right]-\mathrm{i}\left({\alpha }_{\mathrm{i}\mathrm{2}}\mathcal{J}\left(t\right)-{\alpha }_{\mathrm{p}\mathrm{2}}g\left(t\right)\right)\left[p,\rho \right]-\mathrm{i}\sqrt{k}{\alpha }_{\mathrm{p}\mathrm{1}}\left[x,x\rho \left(t\right)+\rho \left(t\right)x\right]\right.\hfill \\ \hfill & \left.-\mathrm{i}\sqrt{k}{\alpha }_{\mathrm{p}\mathrm{2}}\left[p,x\rho \left(t\right)+\rho \left(t\right)x\right]\right\}\mathrm{d}t+\mathcal{H}\left[\sqrt{k\eta }x-\mathrm{i}\frac{{\alpha }_{\mathrm{p}\mathrm{1}}x+{\alpha }_{\mathrm{p}\mathrm{2}}p}{\sqrt{k\eta }}\right]\rho \left(t\right)\mathrm{d}W\left(t\right).\hfill \end{align} \tag{ B2 }$

In both cases g(t) is a goal that we define in the main text, with e(t) the corresponding error signal. The proportional feedback component is the same as in reference [6], except that in this work we also consider a time delay τ_P > 0 in the feedback loop.

Appendix C.: Equations for first and second moments of harmonic oscillator under PI feedback

C.1. Second moments

The equations of motion of the second moments of the oscillator can be derived by evaluation of tr[V_x dρ(t)] = tr[(x − ⟨x⟩)²dρ(t)], etc. These evolve as

$\begin{align}\hfill {\dot {V}}_{x}& =-2\gamma {V}_{x}+\gamma \left(2N+1\right)/m\omega +\left(2/m\right){C}_{xp}-4k\eta {V}_{x}^{2},\hfill \\ \hfill {\dot {V}}_{p}& =-2\gamma {V}_{p}+\gamma \left(2N+1\right)/m\omega -2m{\omega }^{2}{C}_{xp}-4k\eta {C}_{xp}^{2}+k,\hfill \\ \hfill {\dot {C}}_{xp}& =-4\gamma {C}_{xp}+{V}_{p}/m-m{\omega }^{2}{V}_{x}-4k\eta {C}_{xp}{V}_{x}.\hfill \end{align} \tag{ C1a }$

Note that there is no dependence of these equations on the first moments, the feedback operator or on the measurement record. Figure 7 shows representative evolution of these second moments for system parameters used in the main text (m = ω = N = 1, k = γ = 1/50, η = 0.4).
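Because equation (C1a) is closed in the second moments, its steady state can be found by direct numerical integration. The sketch below relaxes the three equations as printed with a fixed-step fourth-order Runge-Kutta scheme, using the parameters quoted above. The initial condition (a thermal state with V_x = V_p = 1.5, C_xp = 0), the step size, and the integration time are assumptions made for illustration.

```python
def second_moment_rhs(state, m=1.0, omega=1.0, N=1.0, k=0.02, gamma=0.02, eta=0.4):
    """Right-hand sides of equation (C1a) for (V_x, V_p, C_xp), as printed."""
    Vx, Vp, Cxp = state
    thermal = gamma * (2 * N + 1) / (m * omega)
    dVx = -2 * gamma * Vx + thermal + (2 / m) * Cxp - 4 * k * eta * Vx**2
    dVp = -2 * gamma * Vp + thermal - 2 * m * omega**2 * Cxp - 4 * k * eta * Cxp**2 + k
    dCxp = -4 * gamma * Cxp + Vp / m - m * omega**2 * Vx - 4 * k * eta * Cxp * Vx
    return (dVx, dVp, dCxp)


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for an autonomous ODE system."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)))
    k4 = f(tuple(s + dt * d for s, d in zip(state, k3)))
    return tuple(
        s + dt * (a + 2 * b + 2 * c + d) / 6
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def relax_second_moments(initial=(1.5, 1.5, 0.0), dt=0.01, t_final=400.0):
    """Integrate (C1a) from an assumed thermal initial state until it settles."""
    state = initial
    for _ in range(int(round(t_final / dt))):
        state = rk4_step(second_moment_rhs, state, dt)
    return state


Vx_ss, Vp_ss, Cxp_ss = relax_second_moments()
print(f"V_x = {Vx_ss:.4f}, V_p = {Vp_ss:.4f}, C_xp = {Cxp_ss:.4f}")
```

Since the right-hand sides involve neither the first moments nor the measurement record, the same steady-state values apply to every trajectory and to every feedback law considered here.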

C.2. First moments: x and p control, proportional feedback

The evolution of the first moments of the oscillator with x and p actuation and proportional feedback with the feedback coefficients specified in section 4.1.1 is given by

$\begin{align}\hfill \mathrm{d}\langle x\rangle \left(t\right)=& \frac{1}{m}\langle p\rangle \left(t\right)\mathrm{d}t-\gamma \left(\langle x\rangle \left(t\right)-{x}_{g}\left(t\right)\right)\mathrm{d}t-4k\eta {V}_{x}\left(t\right)\left(\langle x\rangle \left(t\right)-{x}_{g}\left(t\right)\right)\mathrm{d}t,\hfill \end{align} \tag{ C2a }$

$\begin{align}\hfill \mathrm{d}\langle p\rangle \left(t\right)=& -m{\omega }^{2}\langle x\rangle \left(t\right)\mathrm{d}t-\gamma \left(\langle p\rangle \left(t\right)-{p}_{g}\left(t\right)\right)\mathrm{d}t-4k\eta {C}_{xp}\left(t\right)\left(\langle x\rangle \left(t\right)-{x}_{g}\left(t\right)\right)\mathrm{d}t.\hfill \end{align} \tag{ C2b }$

C.3. First moments: x and p control, integral feedback

The evolution of the first moments of the oscillator with x and p actuation and integral feedback with the feedback coefficients specified in section 4.1.2 is given by

$\begin{align}\hfill \mathrm{d}\langle x\rangle \left(t\right)=& \frac{1}{m}\langle p\rangle \left(t\right)\mathrm{d}t-\gamma \left(\langle x\rangle \left(t\right)-{x}_{g}\left(t\right)\right)\mathrm{d}t+{\alpha }_{\mathrm{i}\mathrm{2}}\mathcal{J}\left(t\right)\mathrm{d}t+2\sqrt{\eta k}{V}_{x}\left(t\right)\mathrm{d}W\left(t\right)\hfill \end{align} \tag{ C3a }$

$\begin{align}\hfill \mathrm{d}\langle p\rangle \left(t\right)=& -m{\omega }^{2}\langle x\rangle \left(t\right)\mathrm{d}t-\gamma \left(\langle p\rangle \left(t\right)-{p}_{g}\left(t\right)\right)\mathrm{d}t+{\alpha }_{\mathrm{i}\mathrm{1}}\mathcal{J}\left(t\right)\mathrm{d}t+2\sqrt{\eta k}{C}_{xp}\left(t\right)\mathrm{d}W\left(t\right).\hfill \end{align} \tag{ C3b }$

C.4. First moments: x control, proportional feedback

The evolution of the first moments of the oscillator with x actuation only and proportional feedback with the feedback coefficients specified in section 4.2.1 is given by

$\begin{align}\hfill \mathrm{d}\langle x\rangle \left(t\right)=& \frac{1}{m}\langle p\rangle \left(t\right)\mathrm{d}t-\gamma \langle x\rangle \left(t\right)\mathrm{d}t+2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{d}W\left(t\right)\hfill \end{align} \tag{ C4a }$

$\begin{align}\hfill \mathrm{d}\langle p\rangle \left(t\right)=& -m{\omega }^{2}\langle x\rangle \left(t\right)\mathrm{d}t-\gamma \left(\langle p\rangle \left(t\right)-{p}_{g}\left(t\right)\right)\mathrm{d}t+4k\eta {V}_{x}\left(t\right)m\omega \left(\langle x\rangle \left(t-\frac{T}{4}\right)-{x}_{g}\left(t-\frac{T}{4}\right)\right)\mathrm{d}t\hfill \\ \hfill & \quad +2\sqrt{k\eta }\left({C}_{xp}\left(t\right)\mathrm{d}W\left(t\right)-m\omega {V}_{x}\left(t\right)\mathrm{d}W\left(t-\frac{T}{4}\right)\right)\hfill \end{align} \tag{ C4b }$

$\begin{align}\hfill & \approx -m{\omega }^{2}\langle x\rangle \left(t\right)\mathrm{d}t-\gamma \left(\langle p\rangle \left(t\right)-{p}_{g}\left(t\right)\right)\mathrm{d}t-4k\eta {V}_{x}\left(t\right)m\omega \left(\langle p\rangle \left(t\right)-{p}_{g}\left(t\right)\right)\mathrm{d}t\hfill \\ \hfill & \quad +2\sqrt{k\eta }\left({C}_{xp}\left(t\right)\mathrm{d}W\left(t\right)-m\omega {V}_{x}\left(t\right)\mathrm{d}W\left(t-\frac{T}{4}\right)\right).\hfill \end{align} \tag{ C4c }$

C.5. First moments: x control, integral feedback

The evolution of the first moments of the oscillator with x actuation only and integral feedback with the feedback coefficients specified in section 4.2.2 is given by

$\begin{equation}\begin{aligned}\hfill \mathrm{d}\langle x\rangle \left(t\right)=& \frac{1}{m}\langle p\rangle \left(t\right)\mathrm{d}t-\gamma \langle x\rangle \left(t\right)\mathrm{d}t+2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{d}W\left(t\right),\hfill \\ \hfill \mathrm{d}\langle p\rangle \left(t\right)=& -m{\omega }^{2}\langle x\rangle \left(t\right)\mathrm{d}t-\gamma \left(\langle p\rangle \left(t\right)-{p}_{g}\left(t\right)\right)\mathrm{d}t-4k\eta {V}_{x}\left(t\right)\mathcal{J}\left(t\right)\mathrm{d}t+2\sqrt{k\eta }{C}_{xp}\left(t\right)\mathrm{d}W\left(t\right).\hfill \end{aligned}\end{equation} \tag{ C5 }$

Appendix D.: Quality of the period-averaging approximation

Here we evaluate the quality of the period-averaging approximation used in section 4 of the main text. Consider the deterministic evolution of the oscillator mean deviations $\tilde {X}\left(t\right)$ and $\tilde {P}\left(t\right)$ in equation (26), and its period-averaged approximation in equation (27). In figure 8 we plot the evolution of the oscillator means in the rotating frame, X(t) and P(t), under the exact dynamical equation and its period-averaged approximation for oscillator parameters m = ω = N = 1, k = γ = 1/50, η = 0.4, initial conditions X₀ = P₀ = 10, and target values X_g = 6, P_g = 4. The exact and approximate evolutions agree very well.

**Figure 8.** Evolution of X and P quadratures in the rotating frame of an oscillator subject to x and p proportional feedback controls, under the exact evolution (solid, black lines) and the period-averaged evolution (dashed, colored lines). The system parameters are m = ω = N = 1, k = γ = 1/50, η = 0.4, initial conditions X₀ = P₀ = 10, and target values X_g = 6, P_g = 4. The inset is a zoom into the early time scale when the deviation between exact and approximate evolution is greatest.

Appendix E.: Steady-state compensation for harmonic oscillator with x actuation only

As stated in the main text, under x actuation only, time-delayed proportional feedback results in the following evolution equations for the deviations in the rotating frame:

$\begin{align}\hfill \mathrm{d}\tilde {X}\left(t\right)\approx & 4k\eta {V}_{x}\left(t\right)\left[-m\omega \tilde {X}\left(t\right)\mathrm{sin}\left(\omega t\right)+\tilde {P}\left(t\right)\mathrm{cos}\left(\omega t\right)\right]\mathrm{sin}\left(\omega t\right)/m\omega \enspace \mathrm{d}t-\gamma \tilde {X}\left(t\right)\mathrm{d}t\hfill \\ \hfill & +\left[-\gamma {X}_{g}\enspace {\mathrm{cos}}^{2}\left(\omega t\right)-\gamma {P}_{g}\enspace \mathrm{sin}\left(\omega t\right)\mathrm{cos}\left(\omega t\right)/m\omega \right]\mathrm{d}t\hfill \\ \hfill & +2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{d}W\left(t\right)\mathrm{cos}\left(\omega t\right)-2\sqrt{k\eta }\left({C}_{xp}\left(t\right)\mathrm{d}W\left(t\right)-m\omega {V}_{x}\left(t\right)\mathrm{d}W\left(t-\frac{T}{4}\right)\right)\mathrm{sin}\left(\omega t\right)/m\omega \hfill \end{align} \tag{ E1a }$

$\begin{align}\hfill \qquad \approx & \left[-2k\eta {V}_{x}\left(t\right)\tilde {X}\left(t\right)-\gamma \tilde {X}\left(t\right)-\frac{\gamma }{2}{X}_{g}\right]\mathrm{d}t\hfill \\ \hfill & +\left(2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)-2\sqrt{k\eta }{C}_{xp}\left(t\right)\mathrm{sin}\left(\omega t\right)/m\omega \right)\mathrm{d}W\left(t\right)+2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)\mathrm{d}W\left(t-\frac{T}{4}\right)\hfill \end{align} \tag{ E1b }$

$\begin{align}\hfill \mathrm{d}\tilde {P}\left(t\right)\approx & 4k\eta {V}_{x}\left(t\right)\left[-\tilde {P}\left(t\right)\mathrm{cos}\left(\omega t\right)+m\omega \tilde {X}\left(t\right)\mathrm{sin}\left(\omega t\right)\right]\mathrm{cos}\left(\omega t\right)\mathrm{d}t-\gamma \tilde {P}\left(t\right)\mathrm{d}t\hfill \\ \hfill & +\left[-\gamma {P}_{g}\enspace {\mathrm{sin}}^{2}\left(\omega t\right)-\gamma m\omega {X}_{g}\enspace \mathrm{sin}\left(\omega t\right)\mathrm{cos}\left(\omega t\right)\right]\mathrm{d}t\hfill \\ \hfill & +2m\omega \sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{d}W\left(t\right)\mathrm{sin}\left(\omega t\right)+2\sqrt{k\eta }\left({C}_{xp}\left(t\right)\mathrm{d}W\left(t\right)-m\omega {V}_{x}\left(t\right)\mathrm{d}W\left(t-\frac{T}{4}\right)\right)\mathrm{cos}\left(\omega t\right)\hfill \end{align} \tag{ E1c }$

$\begin{align}\hfill \qquad \approx & \left[-2k\eta {V}_{x}\left(t\right)\tilde {P}\left(t\right)-\gamma \tilde {P}\left(t\right)-\frac{\gamma }{2}{P}_{g}\right]\mathrm{d}t\hfill \\ \hfill & +\left(2m\omega \sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)+2\sqrt{k\eta }{C}_{xp}\left(t\right)\mathrm{cos}\left(\omega t\right)\right)\mathrm{d}W\left(t\right)-2\sqrt{k\eta }m\omega {V}_{x}\left(t\right)\enspace \mathrm{cos}\left(\omega t\right)\mathrm{d}W\left(t-\frac{T}{4}\right),\hfill \end{align} \tag{ E1d }$

where in the second line of each equation we have applied the period-averaging approximation of section 4 to the deterministic terms, and regrouped the stochastic terms.
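The period-averaging step replaces the rapidly oscillating trigonometric factors in equations (E1a) and (E1c) by their averages over one period T = 2π/ω, treating $\tilde{X}$, $\tilde{P}$ and V_x as constant over that period. A minimal Python sketch of these averages follows; the helper name and the sample count are illustrative assumptions.

```python
import math


def period_average(f, omega=1.0, samples=10_000):
    """Average f(t) over one period T = 2*pi/omega on a uniform grid."""
    T = 2 * math.pi / omega
    return sum(f(i * T / samples) for i in range(samples)) / samples


omega = 1.0
avg_sin2 = period_average(lambda t: math.sin(omega * t) ** 2, omega)
avg_cos2 = period_average(lambda t: math.cos(omega * t) ** 2, omega)
avg_sincos = period_average(lambda t: math.sin(omega * t) * math.cos(omega * t), omega)

print(f"<sin^2> = {avg_sin2:.6f}, <cos^2> = {avg_cos2:.6f}, <sin cos> = {avg_sincos:.6f}")
```

With sin² and cos² averaging to 1/2 and the cross term averaging to zero, the drift 4kηV_x[−mω$\tilde{X}$ sin(ωt) + $\tilde{P}$ cos(ωt)] sin(ωt)/mω in (E1a) reduces to −2kηV_x$\tilde{X}$, and the target-dependent term −γX_g cos²(ωt) reduces to −(γ/2)X_g, as in (E1b). The same substitutions take (E1c) to (E1d).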

We will show below that this system is driven to a steady state with ensemble average quadrature values (where the ensemble average is taken over many trajectories) given by $\mathbb{E}\left[\langle X\rangle \left(t\to \infty \right)\right]=\alpha {X}_{g}$ and $\mathbb{E}\left[\langle P\rangle \left(t\to \infty \right)\right]=\beta {P}_{g}$ , with α < 1 and β < 1.

Inspection of equation (E1) shows that this bias in the ensemble average steady state can be corrected by rescaling the target quadrature mean values X_g and P_g to compensate for α and β, provided these two coefficients can be determined. To do this, we write the solution of equation (E1) under the period-averaging approximation in matrix form as

$\begin{equation}Z\left(t\right)={\mathrm{e}}^{a\left(t\right)}Z\left(0\right)+{\int }_{0}^{t}\mathrm{d}\tau \enspace {\mathrm{e}}^{a\left(t-\tau \right)}b\left(\tau \right)+{\int }_{0}^{t}\mathrm{d}W\left(\tau \right){\mathrm{e}}^{a\left(t-\tau \right)}c\left(\tau \right)+{\int }_{T/4}^{t}\mathrm{d}W\left(\tau -T/4\right){\mathrm{e}}^{a\left(t-\tau \right)}\enspace d\left(\tau \right),\end{equation} \tag{ E2 }$

with

$\begin{align*}\hfill a\left(t\right)& =-\gamma t-2k\eta {\int }_{0}^{t}\mathrm{d}\tau \enspace {V}_{x}\left(\tau \right),\hfill & \hfill b\left(t\right)& =-\frac{\gamma }{2}{\left[{X}_{g},{P}_{g}\right]}^{\mathsf{T}},\hfill \\ \hfill c\left(t\right)& =\left[\begin{matrix}\hfill 2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)-2\sqrt{k\eta }{C}_{xp}\left(t\right)\mathrm{sin}\left(\omega t\right)/m\omega \hfill \\ \hfill 2m\omega \sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)+2\sqrt{k\eta }{C}_{xp}\left(t\right)\mathrm{cos}\left(\omega t\right)\hfill \end{matrix}\right],\hfill & \hfill d\left(t\right)& =\left[\begin{matrix}\hfill 2\sqrt{k\eta }{V}_{x}\left(t\right)\mathrm{sin}\left(\omega t\right)\hfill \\ \hfill -2\sqrt{k\eta }m\omega {V}_{x}\left(t\right)\mathrm{cos}\left(\omega t\right)\hfill \end{matrix}\right].\hfill \end{align*}$

The first term decays exponentially to zero. The second term provides a deterministic offset from zero at long times, which is exactly what leads to the α, β scaling factors in the steady state. The third and fourth terms generate fluctuations on all trajectories. However, since e^a(t−τ) c(τ) and e^a(t−τ) d(τ) are non-anticipating functions (they are independent of the Wiener process), both of these terms vanish when the expectation value over different measurement realizations is taken. Therefore, we can solve for the ensemble average steady state by dropping the stochastic terms and evaluating the t → ∞ value of equation (E2) (or equivalently dropping the stochastic terms from equations (E1b) and (E1d) and solving for the steady state). Doing this yields $\alpha =\beta \approx \left(2k\eta {V}_{x}^{\text{ss}}+\gamma /2\right)/\left(2k\eta {V}_{x}^{\text{ss}}+\gamma \right)$ , where ${V}_{x}^{\text{ss}}$ is the steady-state value of the second moment V_x. We note that this expression for α and β is approximate, because we solved for the steady state from equation (E2), which was derived under the period-averaged evolution, and because we also assumed that ⟨x⟩(t − T/4) ≈ ⟨p⟩(t)/mω in formulating our control law. However, both of these approximations are very well justified in the γ, k ≪ mω limit, so the corresponding expressions provide excellent estimates of the average steady state for the trajectory. Knowing the values of α and β (= α), we can then compensate for the thermal damping by setting ${X}_{g}={X}_{g}^{\text{true}}/\alpha$ and ${P}_{g}={P}_{g}^{\text{true}}/\alpha$ , where ${X}_{g}^{\text{true}}$ and ${P}_{g}^{\text{true}}$ are the true target values of the quadrature means¹. Note that this implies a similar rescaling of the laboratory frame target values, i.e., ${x}_{g}={x}_{g}^{\text{true}}/\alpha$ , ${p}_{g}={p}_{g}^{\text{true}}/\alpha$ (figure 9).
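The compensation factor and the target rescaling can be sketched in a few lines of Python. The function names are illustrative assumptions. The numerical inputs are the parameters k = γ = 1/50, η = 0.4 and the values α ≈ 0.7434, X_g = 6, P_g = 4 (with m = ω = 1) quoted in the caption of figure 9.

```python
def compensation_factor(Vx_ss, k=0.02, gamma=0.02, eta=0.4):
    """alpha = beta from the text: (2*k*eta*Vx_ss + gamma/2) / (2*k*eta*Vx_ss + gamma)."""
    g = 2 * k * eta * Vx_ss
    return (g + gamma / 2) / (g + gamma)


def implied_Vx_ss(alpha, k=0.02, gamma=0.02, eta=0.4):
    """Invert the expression for alpha to get the steady-state V_x it corresponds to."""
    return gamma * (alpha - 0.5) / (2 * k * eta * (1 - alpha))


def rescaled_targets(X_true, P_true, alpha):
    """Targets fed to the controller so that the ensemble average lands on the true ones."""
    return X_true / alpha, P_true / alpha


alpha_caption = 0.7434                      # value quoted in the figure 9 caption
Vx_ss = implied_Vx_ss(alpha_caption)        # steady-state V_x consistent with that alpha
X_g, P_g = rescaled_targets(6.0, 4.0, alpha_caption)

print(f"implied V_x^ss = {Vx_ss:.4f}")
print(f"rescaled targets: X_g = {X_g:.4f}, P_g = {P_g:.4f}")
```

The expression for α interpolates between 1/2 (when 2kηV_x^ss ≪ γ, so thermal damping dominates) and 1 (when 2kηV_x^ss ≫ γ), so the rescaled targets are always larger than the true ones by a factor between 1 and 2. The implied V_x^ss can be compared against the steady state of equation (C1a) as a consistency check.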

**Figure 9.** Evolution of X and P quadratures in the rotating frame of an oscillator subject to an x feedback Hamiltonian alone (representative individual trajectories). The parameters of the oscillator are as follows: m = ω = N = 1, η = 0.4, k = γ = mω²/(50). The initial state is set to ⟨X⟩ = 10, ⟨P/mω⟩ = 10 and the target values are X_g = 6, P_g = 4mω (marked by dotted lines in both panels). For these simulations we used dt = T/500 = 0.0126. (a) Proportional feedback control, simulated by equation (31) with time delay τ_P = T/4. Maximum standard deviation of ⟨X⟩ and ⟨P/mω⟩ in steady state is 0.2420. (b) Integral feedback control, simulated by equation (34) with τ_I' = T/2. Maximum standard deviation of ⟨X⟩ and ⟨P/mω⟩ in steady state is 0.2395. The steady state compensation in terms of the α, β discussed in the text is incorporated into both of these simulations, with α = β ≈ 0.7434.

Appendix F.: Harmonic oscillator stabilization: effect of time delays on P feedback strategies

For the harmonic oscillator state stabilization example presented in the main text, we derived effective P feedback strategies in the case of x and p actuation, and of x actuation only. In the former case, the P feedback strategy required zero time delay, τ_P = 0, while in the latter, formulating a momentum estimate required a time delay of τ_P = T/4.

Given that any real feedback loop will have some time delay, and that sometimes it is difficult to make this delay small compared to the natural timescales of the system being controlled, we study the impact of larger-than-desired time delays on the P feedback strategies in this appendix, to examine their robustness with respect to variations in τ_P.

F.1. x and p control

In the case where x and p actuation is available, figure 5(a) of the main text shows that the ideal P feedback strategy with τ_P = 0 achieves deterministic and exponential convergence of the quadrature expectations to their target values. In figure 10 we show the behavior of the quadrature expectations for finite delay times, τ_P > 0. The trajectories are very different from the case of τ_P = 0, showing increasing noise as τ_P increases. This is expected, since with a finite time delay we no longer exactly cancel the measurement-induced fluctuations. While the ensemble average of the trajectories still converges to the target state (left panels of figure 10), albeit at a slower rate than for τ_P = 0, individual trajectories fluctuate around the target quadrature values. Thus with τ_P > 0 the long-time behavior of the quadrature expectations has zero bias from the target values (i.e., $\mathbb{E}\langle X\rangle \left(t\to \infty \right)={X}_{g}$ and $\mathbb{E}\langle P\rangle \left(t\to \infty \right)={P}_{g}$ ), but non-zero variance.

**Figure 10.** The effect of time delays on the P feedback law when x and p actuation are available (section 4.1.1). The subfigures show three values for the time delay. The left panel in each subfigure shows the ensemble average of the quadrature expectations, $\mathbb{E}\langle X\left(t\right)\rangle$ and $\mathbb{E}\langle P\left(t\right)\rangle$ over 1000 trajectories, and the right panel shows a representative trajectory. The maximum standard deviations of the trajectories at long times are 0.245, 0.397, 0.837, for τ_P = 0.05T, 0.1T, 0.2T, respectively.

The zero bias property of the quadrature expectations from their target values at long times can be proved rigorously. We return to the equations of motion for the quadratures in the presence of finite time delay, equation (25), transform to the rotating frame and then, consistent with averaging over an ensemble of trajectories, drop the stochastic terms to obtain coupled deterministic equations for the deviations $\tilde {X}$ and $\tilde {P}$ . This is all done while retaining a finite value of τ_P. Following the notation of the main text, we then arrive at

$\begin{equation}\dot {Z}\left(t\right)=-\gamma Z\left(t\right)+AZ\left(t-{\tau }_{\mathrm{P}}\right),\end{equation} \tag{ F1 }$

where $Z\left(t\right)={\left[\tilde {X}\left(t\right)-{X}_{g},\tilde {P}\left(t\right)-{P}_{g}\right]}^{\mathrm{T}}$ and

$\begin{equation}A=-2k\eta \left[\begin{matrix}\hfill {V}_{s}\enspace \mathrm{cos}\left(\omega {\tau }_{\mathrm{P}}\right)-{C}_{s}\enspace \mathrm{sin}\left(\omega {\tau }_{\mathrm{P}}\right)/m\omega \hfill & \hfill -{V}_{s}\enspace \mathrm{sin}\left(\omega {\tau }_{\mathrm{P}}\right)/m\omega -{C}_{s}\enspace \mathrm{cos}\left(\omega {\tau }_{\mathrm{P}}\right)/{\left(m\omega \right)}^{2}\hfill \\ \hfill m\omega {V}_{s}\enspace \mathrm{sin}\left(\omega {\tau }_{\mathrm{P}}\right)+{C}_{s}\enspace \mathrm{cos}\left(\omega {\tau }_{\mathrm{P}}\right)\hfill & \hfill {V}_{s}\enspace \mathrm{cos}\left(\omega {\tau }_{\mathrm{P}}\right)-{C}_{s}\enspace \mathrm{sin}\left(\omega {\tau }_{\mathrm{P}}\right)/m\omega \hfill \end{matrix}\right].\end{equation} \tag{ F2 }$

Note that we have replaced the second moments by their time-independent steady state values since we are going to be considering the long-time behavior of the system; V_x(t) → V_s, V_p(t) → V_s, C_xp(t) → C_s. Consider the Laplace transform of Z(t): $Z\left(s\right)={\int }_{0}^{\infty }\mathrm{d}t\enspace {\text{e}}^{-st}\enspace Z\left(t\right)$ . The final value theorem says:

$\begin{equation}{\mathrm{lim}}_{t\to \infty }\enspace Z\left(t\right)={\mathrm{lim}}_{s\to 0}\enspace sZ\left(s\right).\end{equation} \tag{ F3 }$

The Laplace transform of equation (F1) is given by

$\begin{equation}\begin{aligned}\hfill & sZ\left(s\right)-Z\left(0\right)=-\gamma Z\left(s\right)+A\enspace {\text{e}}^{-{\tau }_{\mathrm{P}}s}Z\left(s\right)\hfill \\ \hfill {\Rightarrow}& sZ\left(s\right)-Z\left(0\right)=-\frac{\gamma }{s}sZ\left(s\right)+\frac{A\enspace {\text{e}}^{-{\tau }_{\mathrm{P}}s}}{s}sZ\left(s\right)\hfill \\ \hfill {\Rightarrow}& sZ\left(s\right)\left(1+\frac{\gamma }{s}-\frac{A\enspace {\text{e}}^{-{\tau }_{\mathrm{P}}s}}{s}\right)=Z\left(0\right)\hfill \\ \hfill {\Rightarrow}& sZ\left(s\right)={\left(I\left(1+\frac{\gamma }{s}\right)-\frac{A\enspace {\text{e}}^{-{\tau }_{\mathrm{P}}s}}{s}\right)}^{-1}Z\left(0\right)\equiv {M}^{-1}Z\left(0\right).\hfill \end{aligned}\end{equation} \tag{ F4 }$

Assuming for simplicity that m = ω = 1 (as in the main text), we have

$\begin{equation}M=\left[\begin{matrix}\hfill 1+\frac{\gamma }{s}\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill 1+\frac{\gamma }{s}\hfill \end{matrix}\right]+\left[\begin{matrix}\hfill 2k\eta \left({V}_{s}x-{C}_{s}y\right)\frac{{\text{e}}^{-{\tau }_{\mathrm{P}}s}}{s}\hfill & \hfill -2k\eta \left({V}_{s}y+{C}_{s}x\right)\frac{{\text{e}}^{-{\tau }_{\mathrm{P}}s}}{s}\hfill \\ \hfill 2k\eta \left({V}_{s}y+{C}_{s}x\right)\frac{{\text{e}}^{-{\tau }_{\mathrm{P}}s}}{s}\hfill & \hfill 2k\eta \left({V}_{s}x-{C}_{s}y\right)\frac{{\text{e}}^{-{\tau }_{\mathrm{P}}s}}{s}\hfill \end{matrix}\right],\end{equation} \tag{ F5 }$

where x = cos(τ_P) and y = sin(τ_P). M⁻¹ can be explicitly computed and written as

$\begin{equation}\begin{aligned}\hfill {M}^{-1}=& \frac{s}{{\left(s+\gamma +2k\eta \left({V}_{s}x-{C}_{s}y\right){\text{e}}^{-{\tau }_{\mathrm{P}}s}\right)}^{2}+{\left(2k\eta \left({V}_{s}y+{C}_{s}x\right){\text{e}}^{-{\tau }_{\mathrm{P}}s}\right)}^{2}}\hfill \\ \hfill & {\times}\left[\begin{matrix}\hfill s+\gamma +2k\eta \left({V}_{s}x-{C}_{s}y\right){\text{e}}^{-{\tau }_{\mathrm{P}}s}\hfill & \hfill 2k\eta \left({V}_{s}y+{C}_{s}x\right){\text{e}}^{-{\tau }_{\mathrm{P}}s}\hfill \\ \hfill -2k\eta \left({V}_{s}y+{C}_{s}x\right){\text{e}}^{-{\tau }_{\mathrm{P}}s}\hfill & \hfill s+\gamma +2k\eta \left({V}_{s}x-{C}_{s}y\right){\text{e}}^{-{\tau }_{\mathrm{P}}s}\hfill \end{matrix}\right]\hfill \\ \hfill & \equiv {m}_{\text{inv}}{M}_{\text{inv}}.\hfill \end{aligned}\end{equation} \tag{ F6 }$

It then follows that

$\begin{equation}Z\left(t\to \infty \right)={\mathrm{lim}}_{s\to 0}\enspace sZ\left(s\right)={\mathrm{lim}}_{s\to 0}\enspace {M}^{-1}Z\left(0\right)=0,\end{equation} \tag{ F7 }$

since all matrix elements of M_inv go to constant values as s → 0, while m_inv goes to zero as s → 0, so that lim_s→0 M⁻¹ = 0.
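The limit can also be checked numerically. The Python sketch below builds M from equation (F5) with m = ω = 1, inverts it with the standard 2×2 formula, and prints the largest entry of M⁻¹ for decreasing s. The steady-state moments V_s = 1.2 and C_s = 0.02 and the delay τ_P = 0.2T are placeholder values assumed for illustration; k, γ and η are the values used in the main text.

```python
import math


def M_matrix(s, tau_P, Vs, Cs, k=0.02, gamma=0.02, eta=0.4):
    """Matrix M of equation (F5), assuming m = omega = 1 and real s > 0."""
    x, y = math.cos(tau_P), math.sin(tau_P)
    g = 2 * k * eta * math.exp(-tau_P * s) / s
    diag = 1 + gamma / s + g * (Vs * x - Cs * y)
    off = g * (Vs * y + Cs * x)
    return [[diag, -off], [off, diag]]


def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def max_abs_entry(M):
    return max(abs(entry) for row in M for entry in row)


T = 2 * math.pi          # oscillator period for omega = 1
tau_P = 0.2 * T          # placeholder delay
Vs, Cs = 1.2, 0.02       # placeholder steady-state second moments

norms = [max_abs_entry(inverse_2x2(M_matrix(s, tau_P, Vs, Cs))) for s in (1e-1, 1e-3, 1e-5)]
for s, n in zip((1e-1, 1e-3, 1e-5), norms):
    print(f"s = {s:.0e}:  max |M^-1 entry| = {n:.3e}")
```

For small s the entries of M⁻¹ shrink in proportion to s, reflecting the prefactor m_inv ∝ s in equation (F6) multiplying a matrix with finite entries.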

F.2. x control only

In the case where only x actuation is available, figure 9(a) shows that the ideal P feedback strategy with τ_P = T/4 achieves exponential convergence of the quadrature expectations to their target values, with a restricted amount of noise on the individual trajectories. In figure 11 we now show the behavior of the quadrature expectations when the time delay is not exactly equal to T/4, i.e., for τ_P = T/4 + ε with ε > 0. We see that in this situation the stabilization performance degrades for all values of ε: the quadrature expectations deviate from their targets in expectation (show a bias) and the fluctuations in individual trajectories increase with ε. Thus the performance of the time-delayed P feedback strategy with x control only is less robust to deviations from the ideal τ_P value than that of the P feedback strategy with both x and p control.

**Figure 11.** The effect of time delays on the P feedback law when only x actuation is available (section 4.2.1). The subfigures show three values for the time delay τ_P = T/4 + ε. The left panel in each subfigure shows the ensemble average of the quadrature expectations, $\mathbb{E}\langle X\left(t\right)\rangle$ and $\mathbb{E}\langle P\left(t\right)\rangle$ over 1000 trajectories, and the right panel shows a representative trajectory. The maximum standard deviations of the trajectories at long times are 0.314, 0.453, 0.863, for ε = 0.05T, 0.1T, 0.2T, respectively.
