Quantum computer-enabled receivers for optical communication

Optical communication is the standard for high-bandwidth information transfer in today's digital age. The increasing demand for bandwidth has led to the maturation of coherent transceivers that use phase- and amplitude-modulated optical signals to encode more bits of information per transmitted pulse. Such encoding schemes achieve higher information density, but also require more complicated receivers to discriminate the signaling states. In fact, achieving the ultimate limit of optical communication capacity, especially in the low light regime, requires coherent joint detection of multiple pulses. Despite their superiority, such joint detection receivers are not in widespread use because of the difficulty of constructing them in the optical domain. In this work we describe how optomechanical transduction of phase information from coherent optical pulses to superconducting qubit states followed by the execution of trained short-depth variational quantum circuits can perform joint detection of communication codewords with error probabilities that surpass all classical, individual pulse detection receivers. Importantly, we utilize a model of optomechanical transduction that captures non-idealities such as thermal noise and loss in order to understand the transduction performance necessary to achieve a quantum advantage with such a scheme. We also execute the trained variational circuits on an IBM-Q device with the modeled transduced states as input to demonstrate that a quantum advantage is possible even with current levels of quantum computing hardware noise.

Quantum transduction is the task of converting quantum information from one carrier to another, with these carriers usually being degrees of freedom (DOF) at different energy scales; e.g., superconducting qubits and optical photons. Traditionally, quantum transduction has been viewed as an element of quantum networking, enabling the connection of distributed quantum computers. However, another perspective is that quantum transduction, when the destination is one or more qubits in a scalable quantum computing platform, is a way to get unknown quantum states into a quantum computer. These states can then be used as input for a quantum computation, and when the transduction source is the electromagnetic (EM) field, this enables universal coherent processing of quantum information encoded in an EM field. With this perspective, quantum transduction coupled with quantum computing creates radically new modalities for sensing and detection of information in an EM field, a concept that we call quantum computational imaging and sensing (QCIS) [1]. This term reflects the fact that the computing element and the sensing element are inextricably linked and cannot be separated, and extends the concept of computational imaging and sensing from the classical domain [2,3]. We note that a similar concept has been discussed in Ref. [4].
In this work, we define and analyze a particular application of the QCIS concept: quantum receivers for higher rate coherent optical communication. This application is particularly useful for illustrating the advantage and utility of QCIS because it allows for quantitative comparison of performance against well-understood limits of conventional (classical) receivers. We perform this comparison and identify regimes of quantum advantage (as defined in Section III) in the presence of non-idealities in the quantum transduction and quantum computation stages.

* mnsarov@sandia.gov
First, we introduce the application domain. Optical communication forms the backbone of the modern information age. Despite the staggering increase in communication bandwidth enabled by optical communication [5], the ever-growing demand for bandwidth has led to the deployment of coherent optical communication networks. Coherent communication utilizes phase and amplitude degrees of freedom of an optical pulse, as opposed to just intensity, to squeeze more (classical) information into each pulse. Through phase and amplitude modulation of laser pulses the transmitter encodes information using one of the coherent states forming the communication constellation, see Fig. 1. The task of the receiver is to identify the transmitted state from the possible states in the constellation. Due to the non-orthogonality of coherent states there is always a finite probability of error associated with this task, even in the absence of non-idealities such as transmission loss and noise.
The ultimate limit to classical communication capacity when using a quantum state constellation is given by the Holevo bound on mutual information between sender and receiver, which is only a function of the quantum states received by the receiver [6,7]:

$$ I \le \chi \equiv S(\rho) - \sum_i p_i S(\rho_i), \tag{1} $$

with $I$ being the mutual information, $\rho = \sum_i p_i \rho_i$ and $S(\cdot)$ the von Neumann entropy function. Here, $\{\rho_i, p_i\}$ are the received quantum states and their prior probabilities. A classical receiver that measures the received pulses one at a time and decodes the resulting classical bits cannot attain the Holevo bound, regardless of the channel code utilized and generally, even if the measurements are performed adaptively. Instead, joint detection receivers (JDRs) that collectively measure multiple pulses and decode the resulting codeword are required to approach the Holevo bound [6-8]. This fact is sometimes known as the superadditivity of classical-quantum channel capacity [9] since the capacity achieved with collective measurements on a codeword consisting of several pulses exceeds the sum of the capacities achievable by measuring each pulse separately. Despite the superadditivity of classical-quantum capacity, JDRs are not in widespread use since designing and constructing optimal JDRs is challenging. Theoretical progress has been made in this area recently [8,10-13], but optical implementations of JDRs remain challenging, with the notable exception of the proof-of-principle implementation in Ref. [14].

arXiv:2309.15914v2 [quant-ph] 20 Sep 2024
In this work, we address this application domain within the framework of QCIS. We utilize a model of optomechanical quantum transduction to transfer information from optical pulses to qubits and design a quantum computation on these qubits to jointly discriminate a transmitted codeword. The transduction model includes sources of noise and loss, and our analysis reveals regimes where this setup can surpass optimal single-pulse receivers. We note that recent work by Delaney et al. [15] considered a very similar problem to that studied in this work, with two notable differences: (i) in the following we develop a more complete transduction model than that considered by Delaney et al., including realistic noise sources that impact performance, and (ii) the quantum computation step they employed was based on a message-passing decoding algorithm [16], while we demonstrate that codeword states can be discriminated by variational quantum circuits, which could be more suitable for execution on noisy intermediate-scale quantum (NISQ) devices and when the transduction is non-ideal.
The remainder of the article is organized as follows. In Section I we detail the model of transduction that we use and how it enables transfer of optical coherent state information to superconducting qubits. In Section II we outline the variational quantum circuit approach for performing quantum computations to discriminate codeword states. Then in Section III we demonstrate the concept with numerical simulations and quantify regimes of quantum advantage. In Section IV we demonstrate a small-scale receiver on a cloud-based IBM-Q device by mimicking the input states that would result from the optomechanical transduction model. Finally, in Section V we conclude with a discussion of the results and future work.

FIG. 1. Schematic of coherent communication (sender, channel, receiver, and the receiver's decision on which symbol was sent). The sender prepares one of K coherent states (forming a constellation) corresponding to K possible symbols to communicate; in this example, K = 4, which is known as quadrature phase shift keying (QPSK). In each use of the channel, the prepared state is sent to the receiver. As a result of channel loss and other non-idealities the states arriving at the receiver are distorted. The receiver measures the received pulses and attempts to make a decision about which coherent state was sent.

I. TRANSDUCTION OF COHERENT STATES
We base our analysis on arguably the most mature deterministic quantum transduction platform: optical to microwave frequency transduction through optomechanical systems. Several variations of this transduction mechanism have been demonstrated [17-24]. We begin with the theoretical model for this transduction platform presented by Tian and Wang [25]. The goal will be to transfer an arriving optical coherent state (assumed to be in a well-defined spatial mode) into the optical cavity, and then transfer information about that coherent state (particularly the phase, which often encodes the classical information being transmitted) into the microwave cavity mode. Then the interaction between this microwave mode and a superconducting qubit will be engineered to transfer phase information into the qubit state.
The Hamiltonian describing this model is given in Eq. (2), where $b_i$ for $i = 1, 2, 3$ are annihilation operators for the principal mode of the optical cavity, mechanical oscillator and microwave cavity, respectively. The first line describes the free Hamiltonians for all DOF, with $\omega_1 > \omega_3 > \omega_2$ being the relevant energy scales. The second line describes the coupling between elements. The first two terms describe coupling of both of the cavities to the mechanical oscillator through the standard optomechanical coupling (the occupation of either cavity displaces the mechanical mode through radiation pressure forces). The third term describes the coupling between the microwave mode and qubit, described through a Jaynes-Cummings interaction since we assume the mode and qubit are close to resonance ($\omega_3 \approx \Omega_q$). In the following, we assume that all the couplings, $g_1, g_3, \chi$, are tunable.
In addition to this Hamiltonian, to model the system dynamics we need to include dissipative terms. As a result, the evolution of the density matrix for the combined four DOF system, $\varrho(t)$, follows the master equation, Eq. (3), in which $\mathcal{L}_j$ model loss from each of the cavities at rates $\kappa_j$, and $\mathcal{D}$ models thermal damping and excitation of the mechanical oscillator. Here $\gamma$ is the coupling rate to the thermal reservoir and $\bar n = 1/(e^{\hbar\omega_2/k_B T} - 1)$ is the average occupation of the temperature $T$ reservoir at the mechanical oscillation frequency. We assume that the qubit decoherence is negligible during the transduction process.
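To give a feel for the thermal occupation entering the master equation, the following minimal sketch evaluates the Bose-Einstein formula above at the two operating temperatures considered later (1 K and 1 mK). The mechanical frequency used here is a hypothetical placeholder chosen for illustration; it is not the fiducial value of Appendix A.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def thermal_occupation(omega: float, temperature: float) -> float:
    """Bose-Einstein occupation 1 / (exp(hbar*omega / (k_B*T)) - 1)."""
    x = HBAR * omega / (K_B * temperature)
    # expm1 keeps precision when x is small (high-temperature regime)
    return 1.0 / math.expm1(x)


# Hypothetical mechanical frequency (an assumption, not taken from the paper)
omega_2 = 2.0 * math.pi * 1.0e6  # rad/s

for T in (1.0, 1.0e-3):
    n_bar = thermal_occupation(omega_2, T)
    print(f"T = {T:g} K -> thermal occupation = {n_bar:.4g}")
```

For $k_B T \gg \hbar\omega_2$ the occupation approaches $k_B T/\hbar\omega_2 - 1/2$, which is why reservoir temperature has such a strong effect on the heating added during transduction.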
The parameters entering this model vary according to the physical realization of the optomechanical system. In this work we base our analysis on the devices reported in Refs. [17,18], and list the fiducial values for the parameters used in simulations and analysis in Appendix A.

A. Linearization through parametric driving
The first step in using the dynamics of the system to transduce coherent states from the optical domain to the microwave frequency qubit is to linearize the optomechanical interactions [25]. In order to do this, we add coherent drives to both the optical and microwave cavities at frequencies $\nu_1, \nu_3$. The drive frequencies are red-detuned from both cavities by approximately the frequency of the mechanical oscillator; i.e., $\nu_j = \omega_j - \Delta_j$, with $\Delta_j \approx \omega_2$, and we work with the Hamiltonian for the system in a rotating frame with respect to these drives. Assuming the qubit coupling $\chi$ is turned off, we write the Heisenberg equations of motion for the mode operators, Eq. (7), including the dissipative dynamics in Eq. (3). The driving and dissipation result in steady states of these modes, denoted $B_j$ for the mode operators $b_j$ and $Q_2$ for the mechanical operator $q_2$. We expand the mode operators around these steady states, e.g., $b_j = B_j + \delta b_j$, defining new annihilation operators $\delta b_j$ that quantize fluctuations around the steady states for each DOF. Writing the equations of motion, Eq. (7), in terms of these expansions and neglecting terms quadratic or higher in $\delta b_j$ yields a linearized approximation to the evolution, which is generated by a linearized Hamiltonian (in the rotating frame) and the same dissipative generators as in Eq. (3) (redefined by replacing each $b_j$ with $\delta b_j$).
To obtain the final form of the linearized interaction Hamiltonian we will assume that $|g_j B_j| \ll \omega_2$, and hence drop the counter-rotating terms in the interactions above to get a beam-splitter interaction between the shifted modes. As a result of the parametric drive, we have a beam-splitter interaction between all three (shifted) oscillator degrees of freedom, which is ideal for state transfer. In addition, note that (i) the shifted modes acquire a frequency shift $\Delta_j - g_j Q_2$ that is a priori calculable given knowledge of the system parameters, and (ii) the coupling between modes has been modified from the pure optomechanical coupling rate $g_j$ to $G_j \equiv g_j |B_j|$. This is typically an enhancement since in good operating regimes, $|B_j| > 1$. Finally, since the linearized Hamiltonian is in terms of the shifted operators $\delta b_j$, it is important to note that the states we will consider are states that sit atop the displaced vacuum for all three DOFs: $|B_j\rangle$ for $j = 1, 2, 3$.

B. Optical-to-microwave transduction protocol
Given the linearized model derived above, there are several possible protocols for transferring optical coherent states to the microwave cavity, including sequential swapping of states from a cavity to the resonator and vice versa [25], adiabatic transfer [26], and a hybrid version of these [27]. In the following, we will base our analysis on the sequential swap protocol, which sequentially transfers the incoming coherent state pulse into the optical cavity, optical cavity to mechanical oscillator, and then mechanical oscillator to microwave cavity by tuning $g_j$. This protocol is essentially the same as the one developed by Tian and Wang [25] and is summarized in Appendix B.
The first step in this protocol inputs the arriving coherent state pulse $|\alpha\rangle$ into the optical cavity. Here, $\alpha \in \mathbb{C}$ incorporates any channel loss between the sender and receiver. The carrier frequency of the incoming pulse is resonant with the optical cavity, and the Heisenberg equation of motion describing the dynamics of the cavity mode in a frame rotating at $\omega_1$ is:

$$ \dot b_1(t) = -\frac{\kappa_1}{2}\, b_1(t) + \sqrt{\kappa_1}\, b_{in}(t), $$

where $b_{in}(t)$ is the input field. This linear interaction between the cavity mode and the input field means that an input coherent state is transferred into a coherent state in the cavity, and solving for the expectation of the cavity mode yields:

$$ \langle b_1(t) \rangle = \sqrt{\kappa_1} \int_0^t e^{-\kappa_1 (t-\tau)/2}\, \langle b_{in}(\tau) \rangle \, d\tau, $$

where we have assumed the cavity is unoccupied (vacuum) at $t = 0$. We consider a coherent state input, $\langle b_{in}(t) \rangle = \alpha f(t)$, where $f(t)$ is the pulse profile. For a square pulse of temporal width $T$ and a constant input-output coupling, $\kappa_1$, the cavity mode at time $T$ is $\langle b_1(T) \rangle = 2\alpha (1 - e^{-\kappa_1 T/2})/\sqrt{\kappa_1}$. Thus the coherent state is transferred into the optical cavity after suffering some insertion loss. The factor $\eta_{in} = 2(1 - e^{-\kappa_1 T/2})/\sqrt{\kappa_1}$ ($< 1$ in practice) quantifies the insertion, or input, efficiency. This efficiency can be tuned, and made close to unity, by either engineering the pulse profile [28] or by dynamically tuning the input-output coupling $\kappa_1$ [29]. In addition, any traditional pulse-by-pulse receiver will also incur some insertion loss (that is similarly tunable by a variety of techniques). As a result of this tunability, and in order to compare the unique features of transduction-based receivers to traditional optical receivers, we ignore this insertion loss, and simply assume that a coherent state $|\beta = \alpha\rangle$ is initialized in the optical cavity at a known time. Our further analysis will look at the effect of $|\beta|^2$, the received mean photon number (RMPN), on JDR performance. When comparing to traditional optical receivers we assume they suffer no insertion loss as well, and have access to the same RMPN.
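As a sanity check on the insertion factor quoted above, the sketch below integrates a driven, damped cavity amplitude numerically and compares the result with the closed form $2(1 - e^{-\kappa_1 T/2})/\sqrt{\kappa_1}$. It rests on assumptions that are not stated explicitly in the text: the equation of motion is taken to be $d\langle b_1\rangle/dt = -(\kappa_1/2)\langle b_1\rangle + \sqrt{\kappa_1}\,\alpha f(t)$, the square pulse has unit amplitude $f(t) = 1$ on $[0, T]$, and $\alpha = 1$. The parameter values are hypothetical and expressed in units where $T = 1$.

```python
import math


def eta_in_closed_form(kappa: float, T: float) -> float:
    """Insertion factor 2 * (1 - exp(-kappa*T/2)) / sqrt(kappa)."""
    return 2.0 * (1.0 - math.exp(-kappa * T / 2.0)) / math.sqrt(kappa)


def eta_in_numeric(kappa: float, T: float, steps: int = 20000) -> float:
    """RK4 integration of d<b>/dt = -(kappa/2) <b> + sqrt(kappa), <b>(0) = 0.

    Assumes a unit-amplitude square pulse on [0, T] and alpha = 1, so the
    returned cavity amplitude at time T is the insertion factor itself.
    """
    def rhs(b: float) -> float:
        return -0.5 * kappa * b + math.sqrt(kappa)

    dt = T / steps
    b = 0.0
    for _ in range(steps):
        k1 = rhs(b)
        k2 = rhs(b + 0.5 * dt * k1)
        k3 = rhs(b + 0.5 * dt * k2)
        k4 = rhs(b + dt * k3)
        b += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return b


# Hypothetical values for illustration only
kappa_1, T_pulse = 2.0, 1.0
print("closed form:", eta_in_closed_form(kappa_1, T_pulse))
print("numerical  :", eta_in_numeric(kappa_1, T_pulse))
```

The two numbers agree to numerical precision under these assumptions, which is consistent with reading the insertion factor as the cavity amplitude accumulated by the end of the pulse.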
TABLE I. Transduction heating ($\bar n_{tr}$) and loss ($\eta_{tr}$) parameters for the fiducial device parameters of Appendix A at two operating temperatures.

Temperature    $\bar n_{tr}$    $\eta_{tr}$
T = 1 K        1.8              0.924
T = 1 mK       0.001            0.924

Since the initial state in the optical cavity is Gaussian and the transfer dynamics are linear (governed by a quadratic Hamiltonian and linear dissipative operators), we know the transferred state in the microwave cavity is also Gaussian. In fact, Wang and Clerk have derived the analytic form of the transferred Gaussian state under the linearized model [27, Appendix B]. Given an initial coherent state $|\beta\rangle$ in the optical cavity, a thermal state with thermal occupation $\bar n_0$ of the mechanical oscillator (i.e., we do not assume perfect ground state cooling of the oscillator) and the vacuum state in the microwave cavity, the state transferred to the microwave cavity by the protocol is a displaced thermal state

$$ \varrho(\bar n_{tr}, B_3 + \sqrt{\eta_{tr}}\,\beta). \tag{12} $$

This is a Gaussian state whose quadrature mean is set by the displacement $B_3 + \sqrt{\eta_{tr}}\,\beta$ and whose covariance matrix is $V = (\bar n_{tr} + 1/2) I_2$. Therefore, the transduction from the optical cavity to the microwave cavity results in attenuation and heating of the input state. The attenuation and heating parameters, $\eta_{tr}$ and $\bar n_{tr}$, are expressed and plotted in terms of the physical model parameters in Appendix B, and Table I shows these parameters for the fiducial device parameters considered in Appendix A at two operating temperatures.
While the device parameters we use to arrive at the loss and heating rates in Table I are optimistic, they are not out of reach of modern experimental platforms. Experimental demonstrations of optomechanical transducers have operated at temperatures ranging from 4 K [17] to as low as 7 mK [21,22], with the latter achieved through operation inside optical-access dilution refrigerators. In a continuously operated electro-optomechanical transducer, bidirectional conversion efficiencies as high as 47% have been achieved with 3.2 input-referred added noise photons in the upconversion process (microwave-to-optical) [23]. Alternatively, input-referred added noise levels as low as 0.16 photons in upconversion have been obtained in a pulsed nonlinear electro-optic platform with 8.7% bidirectional efficiency [24]. The gap between these experimental efficiencies and heating rates and those in Table I can be attributed to two factors. Firstly, our model ignores technical sources of imperfection, such as laser noise, optical pump absorption heating, and intrinsic materials loss, which can be minimized in principle. Secondly, as explained in Appendix A, to be within the strongly coupled regime that enables high-fidelity optical-to-microwave transduction [27], we assume much smaller optical and microwave cavity linewidths ($\kappa_1, \kappa_3$) than demonstrated thus far in optomechanical devices. However, we note that such narrow linewidth cavities are well within reach of modern devices [30,31].
The state in the microwave cavity after the cavity transduction steps is given by Eq. (12). However, note that the signal we care about is only a small part of the displacement $B_3 + \sqrt{\eta_{tr}}\,\beta$, since typically $|B_3| \gg |\beta|$. We can compensate for this steady state field offset, which, recall, is necessary to linearize the optomechanical interactions, by applying a compensation drive to the microwave cavity after the transduction step. Specifically, the expectation of the microwave cavity annihilation operator at time $t$ after a drive $a_{in}$ is applied is [32]

$$ \langle b_3(t) \rangle = e^{-\kappa_3 t/2} \langle b_3(0) \rangle + \sqrt{\kappa_3} \int_0^t e^{-\kappa_3 (t-\tau)/2}\, \langle a_{in}(\tau) \rangle \, d\tau. $$

The initial state is $\langle b_3(0) \rangle = B_3 + \sqrt{\eta_{tr}}\,\beta$, and choosing the drive to be time-dependent and of the form $\langle a_{in}(\tau) \rangle = -e^{-\kappa_3 \tau/2} B_3/\sqrt{\kappa_3}$ yields, at time $t = 1$, $\langle b_3(t = 1) \rangle = e^{-\kappa_3/2} \sqrt{\eta_{tr}}\,\beta$. Therefore, the offset field can be removed at the cost of additional loss, $\eta_{tr} \to e^{-\kappa_3} \eta_{tr}$. We assume $\kappa_3$ is tunable and can be made arbitrarily small during this compensating drive (at the cost of increasing the drive power), and therefore ignore this additional loss in the following.

C. Microwave mode to qubit transduction
The final step in the transduction chain is to transfer information about $\beta$ to the qubit coupled to the microwave cavity mode. For this step we assume $\chi$, the mode-qubit coupling, is turned on while $G_1 = G_3 = 0$. We also assume $\chi \gg \kappa_3$, that the qubit is brought in resonance with the microwave mode, and that the coherent coupling between mode and qubit dominates any qubit decoherence as well. In this case we simply have evolution under the Jaynes-Cummings (JC) Hamiltonian of an initial state $\rho_f(0) \otimes \rho_q(0)$. We set the qubit initial state to be the ground state, $\rho_q(0) = |g\rangle\langle g|$, and the JC interaction time can be optimized to maximize distinguishability between the transduced qubit states. This optimal time will have a complicated dependence on $\bar n_{tr}$, $\eta_{tr}$, $\beta$ and $\chi$, and in Appendix C we explore this dependence and other aspects of the JC dynamics.
As an example, consider binary phase shift keying (BPSK) signaling, where the two possible initial states of the microwave mode are $\rho_f(0) = \varrho(\bar n_{tr}, \pm\sqrt{\eta_{tr}}\,\beta)$. In Fig. 2 we plot the two transduced qubit states on the Bloch sphere after numerical optimization of the JC interaction time for two values of RMPN and the two operating temperatures considered (Table I). These figures illustrate that the field rotates the initial qubit state towards the X-Y plane on the Bloch sphere, with the amount of rotation dictated by the average number of photons in the transduced microwave mode, which is a function of the RMPN and transduction loss. Furthermore, transduction-induced heating reduces distinguishability of the qubits by making the states more mixed. Given a large enough RMPN, e.g., $|\beta|^2 = 4$, and low transduction-induced heating, the rotation induced by the two possible phases can result in nearly orthogonal states (Fig. 2(b), blue arrows). The distinguishing feature of the BPSK states, i.e., the phase $\arg(\beta)$, imprints itself onto the azimuthal phase of the qubit as $e^{i \arg(\beta^*)}$, yielding transduced qubit states with opposite phases. See Appendix C for more details.
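The opposite-phase behaviour described above can be checked in the idealized, noiseless limit with a short calculation. The sketch below is an illustration under simplifying assumptions, not the model used for Fig. 2: the field starts in a pure coherent state (no thermal occupation, no loss), the qubit starts in $|g\rangle$, and the interaction is the resonant JC Hamiltonian $\hbar\chi(b_3 \sigma_+ + b_3^\dagger \sigma_-)$, for which $|g, n\rangle \to \cos(\chi t\sqrt{n})|g, n\rangle - i \sin(\chi t\sqrt{n})|e, n-1\rangle$. The function name and parameter values are hypothetical.

```python
import math


def qubit_bloch_after_jc(beta: complex, chi_t: float, n_max: int = 60):
    """Bloch vector (x, y, z) of the qubit after resonant JC evolution.

    Assumes a pure coherent state |beta> in the field and the qubit in |g>;
    chi_t is the dimensionless interaction time chi * t. The Fock space is
    truncated at n_max photons, which is ample for |beta|^2 of order one.
    """
    # Coherent-state amplitudes c_n = exp(-|beta|^2 / 2) * beta^n / sqrt(n!)
    c = [complex(math.exp(-abs(beta) ** 2 / 2.0))]
    for n in range(1, n_max + 1):
        c.append(c[-1] * beta / math.sqrt(n))

    rho_ee = 0.0
    rho_eg = 0j
    for m in range(n_max):
        # Amplitudes of |e, m> and |g, m> after the evolution
        amp_e = -1j * c[m + 1] * math.sin(chi_t * math.sqrt(m + 1))
        amp_g = c[m] * math.cos(chi_t * math.sqrt(m))
        rho_ee += abs(amp_e) ** 2
        rho_eg += amp_e * amp_g.conjugate()

    # rho = (I + x*sx + y*sy + z*sz) / 2 with sz|e> = +|e>
    return (2.0 * rho_eg.real, -2.0 * rho_eg.imag, 2.0 * rho_ee - 1.0)


plus = qubit_bloch_after_jc(0.5, 1.0)
minus = qubit_bloch_after_jc(-0.5, 1.0)
print("Bloch vector for +beta:", plus)
print("Bloch vector for -beta:", minus)
```

In this idealized limit the two BPSK phases give Bloch vectors with identical polar components and opposite equatorial components, which is the structure a downstream circuit can exploit to tell the symbols apart.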

II. VARIATIONAL CIRCUITS FOR STATE DISCRIMINATION
Once the optical coherent states used for communication have been transduced into qubits via the transduction mechanism detailed in Section I, the task of the receiver becomes that of distinguishing between the possible qubit states. We define a codeword as a collection of received pulses that are each transduced into individual qubits. Ultimately achieving channel capacity requires joint decoding of asymptotically large codewords, but one can surpass classical receiver performance even with finite, small codewords. The quantum computation to distinguish the codewords can be constructed using existing decoding strategies as was done in Refs. [15,33]. In contrast, in this work we utilize trained variational quantum circuits to discriminate the possible codewords. This strategy has a number of advantages: (i) it does not require a known decoding strategy, which is especially important when non-idealities of transduction are considered since good decoding strategies for ideal optical coherent state codewords may not port to decoding of imperfect, mixed-state qubit-encoded codewords, (ii) it is more compatible with NISQ computers since the variational ansatz can be chosen according to the circuit depths that can be executed with low error on a given device.
Variational quantum circuits consist of gates with tunable parameters that can be optimized for some task. Typically, the circuits take the form of some ansatz that specifies the circuit structure. In this work we consider variational circuits with the structure illustrated in Fig. 3. The number of tunable layers (and hence variational parameters) can be varied, and we expect to require more layers as the number of codewords to be discriminated increases. Suppose $M$ possible codewords are transmitted using $n$ pulses that are transduced into $n$ qubits ($M \le 2^n$). We measure a fixed $\lceil \log_2 M \rceil$ of the qubits at the circuit output in the computational basis and assign one of the resulting bit strings to each codeword. Then the variational cost function that is maximized is

$$ J(\theta) = \frac{1}{M} \sum_{i=1}^{M} p(b_i | \rho_i), $$

where $\theta$ are the variational parameters of the circuit $U(\theta)$, and $p(b_i | \rho_i)$ is the probability of measuring bit string $b_i$ that corresponds to the codeword $i$, when the input to the circuit is $\rho_i$, the $n$-qubit state that encodes codeword $i$. This cost function is just the average probability of successful decoding, assuming all codewords are sent with equal probability. We numerically train the variational circuits using a method derived from the qFactor optimizer [34]. The cost function is quadratic in gates, which makes the standard update described in Ref. [34] numerically unstable. This is remedied by introducing a regularization factor $\beta$ as described in Sec. IIIA of that paper. qFactor is applied to optimize over the variational ansatz in Fig. 3 with varying numbers of layers (i.e., circuit depths), but it can also optimize over unitaries as opposed to variational circuits. As a result, we also find system-size unitaries that maximize the cost function without assuming any circuit structure. Provided the optimization is successful, this gives us the largest possible cost function that can be achieved with a quantum circuit, and in Section III we show the performance achieved by such optimized $n$-qubit unitaries as well as the optimized variational circuits. The chosen mapping between input states and output bit strings $\rho_i \to b_i$ can affect the average error probability if the number of variational parameters in the decoding circuit is small. However, given enough variational parameters, i.e., enough layers in the variational ansatz, the cost function value becomes independent of this mapping since the circuit becomes expressive enough to implement the qubit permutations necessary to optimize $J$. For small variational circuits this mapping can be optimized along with the circuit to further improve performance, although we do not do this here. Note that while the variational circuit training cost scales exponentially with codeword size, this is a one-off cost. For small codeword sizes ($n \lesssim 15$-$20$) the training can be done numerically once the transduced qubit states are characterized. For larger codeword sizes we envision that the training is done with the quantum computing device itself evaluating the cost function.
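The structure of this cost function can be illustrated on the smallest possible instance. The sketch below is a toy stand-in and not the ansatz of Fig. 3 or the qFactor-based training: it takes $n = 1$ and $M = 2$, uses two hypothetical pure qubit states $|\psi_\pm\rangle = \cos(\phi/2)|0\rangle \pm \sin(\phi/2)|1\rangle$, a one-parameter circuit $U(\theta) = R_y(\theta)$, and a plain grid search over $\theta$. The maximized average success probability is then compared with the Helstrom value $\tfrac{1}{2}(1 + \sqrt{1 - |\langle\psi_+|\psi_-\rangle|^2})$ for two equiprobable pure states.

```python
import math


def cost_J(theta: float, phi: float) -> float:
    """Average success probability for the toy two-state problem.

    The circuit is Ry(theta) followed by a computational-basis measurement,
    with |psi_+> assigned to outcome 0 and |psi_-> assigned to outcome 1.
    """
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)

    def output_probs(a0: float, a1: float):
        # Ry(theta) = [[c, -s], [s, c]] acting on real amplitudes (a0, a1)
        out0 = c * a0 - s * a1
        out1 = s * a0 + c * a1
        return out0 ** 2, out1 ** 2

    p0_plus, _ = output_probs(math.cos(phi / 2.0), math.sin(phi / 2.0))
    _, p1_minus = output_probs(math.cos(phi / 2.0), -math.sin(phi / 2.0))
    return 0.5 * (p0_plus + p1_minus)


def train(phi: float, grid: int = 2001) -> float:
    """Grid search over theta in [-pi, pi]; returns the maximizing angle."""
    thetas = [-math.pi + 2.0 * math.pi * k / (grid - 1) for k in range(grid)]
    return max(thetas, key=lambda t: cost_J(t, phi))


phi = 0.6  # hypothetical angle; the state overlap is cos(phi)
theta_opt = train(phi)
helstrom = 0.5 * (1.0 + math.sqrt(1.0 - math.cos(phi) ** 2))
print("optimal theta     :", theta_opt)
print("trained J         :", cost_J(theta_opt, phi))
print("Helstrom success  :", helstrom)
```

For this toy problem a single rotation already saturates the optimal success probability; the multi-qubit setting in the text needs deeper circuits because the optimal measurement is entangling across the codeword.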

III. DEMONSTRATION OF QUANTUM COMPUTER-ENABLED JOINT DETECTION RECEIVERS
In this section we combine the coherent state transduction model and variational quantum processing model to demonstrate a joint detection receiver that exceeds the performance of all "classical" individual pulse receivers. We restrict ourselves to considering BPSK-based coherent communication, but extension to other communication constellations is straightforward.
For BPSK, the received optical states $|\pm\beta\rangle$ result in two possible transduced single qubit states $\rho_\pm$. We will study transduction with the fiducial parameters presented in Appendix A and at temperatures T = 1 K and T = 1 mK, resulting in the transduction heating and loss parameters in Table I.
The achievable capacity for the optimal conventional receiver that decodes using pulse-by-pulse detection is $C_1 = 1 - h_2\left(\tfrac{1}{2}\left[1 - \sqrt{1 - e^{-4\bar n}}\right]\right)$ bits per pulse, where $h_2$ is the binary entropy function and $\bar n$ is the RMPN [8]. In contrast, as discussed in the Introduction, the asymptotically achievable capacity, enabled by joint detection of codewords, is $C_\infty = \chi$ bits per pulse, see Eq. (1). Here, the Holevo quantity is computed using the state ensemble per pulse available at the receiver with equal prior probabilities for the symbols in the constellation. This capacity is achievable if one uses an error correction code with decoder based on joint measurement of codewords, e.g., [16], or using a codebook with $M = 2^{nR}$ random $n$-bit codewords where each bit is encoded using one of the BPSK symbols. Here, $R$ is the rate of the code, and if $R < C_\infty$ and the receiver attempts to optimally discriminate between the $M$ codewords, the probability of decoding error goes to zero as $n \to \infty$.
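The gap between the two capacities can be made concrete for the ideal optical BPSK ensemble. The sketch below assumes pure, equiprobable states $|\pm\beta\rangle$ with $|\beta|^2$ equal to the RMPN, for which the average state has eigenvalues $(1 \pm e^{-2|\beta|^2})/2$ and the Holevo quantity reduces to a binary entropy. It does not include the transduction heating and loss discussed above, and the function names are illustrative.

```python
import math


def h2(p: float) -> float:
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def helstrom_error(rmpn: float) -> float:
    """Helstrom error probability per BPSK pulse at the given RMPN."""
    # 1 - exp(-4*rmpn) written with expm1 for accuracy at small RMPN
    return 0.5 * (1.0 - math.sqrt(-math.expm1(-4.0 * rmpn)))


def capacity_single_pulse(rmpn: float) -> float:
    """C_1: capacity of the optimal pulse-by-pulse receiver, bits per pulse."""
    return 1.0 - h2(helstrom_error(rmpn))


def holevo_bpsk(rmpn: float) -> float:
    """Holevo quantity for pure, equiprobable |+beta>, |-beta> states."""
    return h2(0.5 * (1.0 + math.exp(-2.0 * rmpn)))


for rmpn in (0.01, 0.1, 1.0):
    c1 = capacity_single_pulse(rmpn)
    c_inf = holevo_bpsk(rmpn)
    print(f"RMPN = {rmpn:<5g} C_1 = {c1:.4f}  Holevo = {c_inf:.4f}  "
          f"ratio = {c_inf / c1:.2f}")
```

The ratio of the two quantities grows as the RMPN decreases, which is why the low-photon-number regime is where joint detection is expected to help most.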
In the following, we will present the probability of decoding error for varying codeword lengths $n$, and in all cases, unless otherwise specified, we choose $M = 2^{n-1}$. The received $n$-pulse codeword is transduced into $n$ qubits; e.g., a length-three optical codeword is transduced into a three-qubit product state with each qubit in one of the states $\rho_\pm$. Fig. 4 shows the average probability of decoding error, $(1 - J)$, for codeword sizes $n = 3, 4$ as a function of the RMPN at the two transduction operating temperatures (T = 1 K and T = 1 mK). We focus on the region of low RMPN because this is where the greatest benefit from using a JDR is expected. We show the average probability of error achievable using variational circuit ansätze of varying depths, and an optimized $n$-qubit unitary. The black curve in all figures shows the $n$-Helstrom limit, which is the minimum error probability achievable when performing individual pulse-by-pulse detection in the optical domain with $n - 1$ received BPSK pulses with the given RMPN (it is $n - 1$ pulses as opposed to $n$ because with the JDR codewords we are communicating $n - 1$ bits). Specifically, the $n$-Helstrom limit is $1 - (1 - p_H)^{n-1}$, where $\bar n$ is the RMPN and $p_H = \tfrac{1}{2} - \tfrac{1}{2}\sqrt{1 - e^{-4\bar n}}$ is the Helstrom bound on error probability per BPSK pulse. Achieving this "classical bound" requires using a Helstrom-bound saturating detector like the Dolinar receiver [35].
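The classical benchmark described above is straightforward to evaluate. The sketch below assumes that each of the $n - 1$ pulses is measured independently at the Helstrom bound and that a block is decoded correctly only if every pulse is, so the block error probability is $1 - (1 - p_H)^{n-1}$; this product form is an assumption made for the illustration, and the function names are hypothetical.

```python
import math


def helstrom_error(rmpn: float) -> float:
    """Helstrom bound on the error probability per BPSK pulse."""
    return 0.5 - 0.5 * math.sqrt(-math.expm1(-4.0 * rmpn))


def n_helstrom_limit(n: int, rmpn: float) -> float:
    """Block error for pulse-by-pulse detection of n - 1 BPSK pulses.

    Assumes independent per-pulse errors, each at the Helstrom bound.
    """
    return 1.0 - (1.0 - helstrom_error(rmpn)) ** (n - 1)


for n in (3, 4, 8):
    row = ", ".join(f"{n_helstrom_limit(n, r):.3f}" for r in (0.05, 0.2, 0.5))
    print(f"n = {n}: limit at RMPN 0.05, 0.2, 0.5 -> {row}")
```

Any JDR error curve falling below the corresponding value at the same RMPN indicates an advantage over pulse-by-pulse detection in the sense used in this section.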
Fig. 4, although it is only for small codeword sizes, reveals several interesting insights. First, it is clear from Fig. 4(b) that the amount of noise introduced by transduction at T = 1 K renders discriminating the codewords difficult, even in the ideal case with optimized $n$-qubit unitaries. The JDR $(1 - J)$ does not reduce below the relevant Helstrom limits for any values of RMPN. In contrast, for low temperature transduction, Fig. 4(a), there is a significant region of RMPN where the $(1 - J)$ achieved by the JDRs improves over the Helstrom limit. Increasing the variational circuit depth allows reduction of $(1 - J)$; however, for $n = 3$ (4), the average error probability almost saturates to values achievable with full unitary optimization by $L = 3$ (4) already.
In Fig. 5 we show how the average probability of error behaves (as a function of RMPN) for larger codeword sizes, $n = 5$-$8$, each with $M = 2^{n-1}$ signaling states. We only show the low temperature transduction cases since, as in the $n = 3, 4$ cases, the JDR cannot attain a $(1 - J)$ lower than the Helstrom limit when the transduction is performed at T = 1 K. In addition, instead of showing behavior with increasing ansatz layers, for simplicity we show the $(1 - J)$ achieved by the optimized $n$-qubit unitaries. For these larger codewords, we see again that there is a significant region of RMPN where the average error in decoding is less than the relevant Helstrom limit, indicating a quantum advantage.
To see how the error probability changes with increasing codeword size, in Fig. 6 we show (1 − J) as a function of RMPN for various n with M = 4 fixed (for transduction at 1 mK). The average error probability decreases with increasing codeword size (i.e., as the rate of the code decreases), and it does so appreciably in the region with the greatest quantum advantage, roughly 0.05 < RMPN < 0.5.
Finally, to understand the asymptotic advantage provided by our JDR consisting of optomechanical transduction and variational quantum computation, we compare various BPSK capacities (per pulse) in Fig. 7. We plot the capacity for our JDR in the ideal transduction case and in the low temperature transduction case. Both provide an improvement over the individual detection receiver capacity ($C_1$), by almost an order of magnitude in the very low RMPN regime. For comparison we also show the BPSK capacity of the JDR proposed in Delaney et al. [15] that proceeds by probabilistic transduction into trapped ions. Our JDR in the ideal limit achieves the same capacity as that of Delaney et al. at most RMPNs, but is slightly below that capacity when the heating and loss of low temperature transduction are factored in. This is not surprising since the transduction model in Delaney et al. does not incorporate non-idealities such as thermal noise and loss. Notably, our JDR overcomes the dip in capacity at large RMPNs of the receiver in Ref. [15], which is caused by heralding a successful transduction on measuring zero photons, and remains close to the Holevo bound on capacity at all RMPNs. The vertical lines in Fig. 7 show the typical RMPN for various space communication links, as calculated in Ref. [15].
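To make the two reference capacities in this comparison concrete, the sketch below evaluates them for equiprobable BPSK coherent states. It relies on two standard textbook expressions that are assumptions here rather than formulas quoted from this paper: the individual-detection capacity is taken as that of a binary symmetric channel with crossover probability equal to the Helstrom bound, and the Holevo bound is the entropy of the average of the two signal states, whose overlap is $e^{-2\bar{n}}$. Function names are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy in bits, with the convention 0 * log(0) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def c1_capacity(rmpn: float) -> float:
    """Assumed individual-detection capacity: BSC with crossover p_H."""
    p_h = 0.5 - 0.5 * math.sqrt(1.0 - math.exp(-4.0 * rmpn))
    return 1.0 - binary_entropy(p_h)


def holevo_capacity(rmpn: float) -> float:
    """Holevo bound for equiprobable BPSK coherent states |+a>, |-a>.

    The average state has eigenvalues (1 +/- overlap) / 2, where
    overlap = exp(-2 * rmpn) is the inner product of the two states.
    """
    overlap = math.exp(-2.0 * rmpn)
    return binary_entropy(0.5 * (1.0 + overlap))


if __name__ == "__main__":
    for rmpn in (1e-3, 1e-2, 1e-1, 1.0):
        c1, chi = c1_capacity(rmpn), holevo_capacity(rmpn)
        print(f"RMPN = {rmpn:g}: C1 = {c1:.4g}, Holevo = {chi:.4g}")
```

The gap between the two curves is the room available to any joint detection receiver; a transduction-based JDR can at best approach the Holevo value.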

IV. EXPERIMENTAL DEMONSTRATION OF VARIATIONAL CIRCUIT DECODING
In this section we aim to demonstrate the robustness of the quantum computer-enabled receiver concept to experimental noise. The transduction model accounts for some of the noise in the transduction physics, including loss and thermal effects, and in this section we account for the noise in the quantum computation. There are fundamental reasons to expect some robustness to noise. The first is that this application of quantum computers does not rely on scale of computation (in terms of number of qubits or circuit depth) to achieve a quantum advantage; instead, the quantum computer enables a measurement that is not possible in the classical regime. The second is that the aim of the computation is not an exact answer but rather a reduction in the average probability of error, and thus it could be more tolerant to errors in the circuit implementation.
To assess the robustness we implement the trained variational circuit for codeword size n = 3 and transduction at T = 1 mK from Section III on the cloud-based IBM device ibm_algiers. The initial states of the circuit are prepared using a layer of single-qubit gates at the beginning of the circuit. Since the initial states (the codeword states after transduction at 1 mK) are mixed, for this demonstration we decompose these mixed states into pure state ensembles that are then operated on by the circuit, and the circuit results are then combined to compute the average probability of error; i.e., $\langle b_i|\mathcal{E}(\rho)|b_i\rangle = \sum_j \lambda_j \langle b_i|\mathcal{E}(|\psi_j\rangle\langle\psi_j|)|b_i\rangle$, where $\lambda_j$ and $|\psi_j\rangle$ are the eigenvalues and eigenvectors of ρ. The average error probability was estimated using 8192 executions of the circuit (shots) per input eigenstate.
Fig. 8 shows the probability of error achieved by the experimental circuit implementation as a function of RMPN for variational circuits with L = 1, 2, 3, with the theoretical calculations and the classical limit for comparison (the latter quantities are the same as in Fig. 4(a)). While the experimental (1 − J) values are consistently greater than the theoretical ones, remarkably the probability of error is lower than the classical n = 3 Helstrom limit for a large range of RMPN values. In Table II we show the device parameters reported by IBM at the time of the experiments. The parameters reveal a conventional contemporary superconducting qubit processor, with average fidelity and coherence characteristics. Thus, even with current quantum computer gate and qubit qualities, the advantage presented by a JDR can be realized if the transduction fidelity is reasonably high.
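The ensemble decomposition used above can be sketched numerically as follows. This is an illustration under stated assumptions, not the code used for the experiment: the three single-qubit input states are arbitrary mixed states specified by hypothetical Bloch vectors, a random unitary stands in for the trained circuit, and exact probabilities replace shot averages. The sketch checks that weighting pure-state results by the eigenvalues reproduces the outcome distribution of the mixed input.

```python
import itertools

import numpy as np


def qubit_state(bloch):
    """Single-qubit density matrix with the given Bloch vector."""
    x, y, z = bloch
    return 0.5 * np.array([[1 + z, x - 1j * y], [x + 1j * y, 1 - z]], dtype=complex)


def probs_from_ensemble(states, unitary):
    """Outcome probabilities obtained by running each pure eigenstate."""
    decomps = [np.linalg.eigh(rho) for rho in states]
    probs = np.zeros(unitary.shape[0])
    # Loop over every product of single-qubit eigenvectors.
    for choice in itertools.product(range(2), repeat=len(states)):
        weight = 1.0
        psi = np.ones(1, dtype=complex)
        for (evals, evecs), k in zip(decomps, choice):
            weight *= evals[k]
            psi = np.kron(psi, evecs[:, k])
        probs += weight * np.abs(unitary @ psi) ** 2
    return probs


def probs_direct(states, unitary):
    """Outcome probabilities from evolving the full mixed product state."""
    rho = np.ones((1, 1), dtype=complex)
    for single in states:
        rho = np.kron(rho, single)
    return np.real(np.diag(unitary @ rho @ unitary.conj().T))


# Hypothetical inputs: three mixed qubits and a random 3-qubit unitary.
rng = np.random.default_rng(0)
gauss = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
unitary, _ = np.linalg.qr(gauss)
states = [qubit_state(v) for v in [(0.6, 0.0, 0.3), (-0.6, 0.0, 0.3), (0.6, 0.0, 0.3)]]

print(np.allclose(probs_from_ensemble(states, unitary), probs_direct(states, unitary)))
```

On hardware, each term of the sum would be estimated from a finite number of shots, so the combined estimate carries statistical error in addition to device noise.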
Fig. 8 illustrates the robustness of the JDR predictions to hardware noise. However, it should be noted that the longest circuit executed, with L = 3 layers, contains only 6 CNOT gates. This is sufficient to minimize the average error probability in this case (as shown in Fig. 4, the error probabilities for L = 3 coincide with those achieved by an optimized 3-qubit unitary transformation), but for larger codeword sizes we expect to require more layers and thus more CNOT gates. For example, for n = 4, the L = 4 optimized circuit has 16 CNOT gates. While this is not too large, due to the limited connectivity of the ibm_algiers device the compiled circuit has 52 CNOTs. As a result, we were unable to observe a significant quantum advantage at this codeword size using this device. It is possible that more sophisticated variational optimization techniques that take into account device connectivity and hardware noise, e.g., the approach in Ref. [36], will improve the experimental performance for larger codeword sizes. Finally, we have also simulated the impact of a simple depolarizing model of gate noise on the receiver circuit performance, and the results are presented in Appendix D.

V. CONCLUSIONS AND DISCUSSION
We have shown how a JDR for optical communication can be constructed from optomechanical transduction and superconducting quantum information processing devices. The performance of such a JDR depends on the transduction physics, specifically on the thermal noise introduced by the mechanical oscillator and losses in the transduction chain. We predict that by operating the optomechanical transducer around T = 1 mK, and with complete tunability of the couplings between the mechanical oscillator and the optical and microwave cavities, one can achieve the transduction fidelities required to demonstrate a JDR with an advantage over all classical receivers that process the received pulses one at a time. This advantage can be realized with quantum computers as small as 3 qubits and with circuits containing as few as 6 CNOT gates. Fundamentally, the advantage arises from the ability to engineer general measurements (positive operator-valued measures, or POVMs) on the codewords.
In addition to numerical simulations, we implemented the variational circuit-based decoder on an IBM cloud-based quantum computer to demonstrate that even with current levels of hardware noise, if the transduction is of sufficiently high fidelity, a quantum computer-enabled JDR can surpass classical bounds on decoding error. The impact of noise can be further reduced by taking noise models into account while performing the variational optimization.
There are several challenges to an end-to-end realization of quantum computer-enabled JDRs. The first is transduction fidelity: as discussed in Section I, state-of-the-art optomechanical transduction efficiencies fall far short of the $\eta_{\mathrm{tr}} \sim 0.9$ assumed in our calculations. Given current devices, it is not possible to surpass the performance of state-of-the-art optical receivers that can achieve detection efficiencies of ∼ 90% [37]. The linewidths of the optical and microwave cavities used in transducers need to be reduced, and technical sources of noise must be minimized, to achieve the transduction fidelities required for a quantum advantage. A second major challenge is the integration of high quality quantum transduction with quantum computers. Despite the progress in quantum transduction in recent years [17-24, 38, 39], such integration has not been demonstrated to our knowledge. However, it should be noted that unlike quantum transduction for quantum networking purposes, for the JDR application we require transduction of weak coherent states and not single photons. Furthermore, the transduction can be unidirectional, from optical to microwave frequencies. Both of these aspects have the potential to make transduction for JDRs easier to implement. Finally, while we focused on optomechanical transduction in this work, it would be fruitful to study other mechanisms for transducing optical coherent states to qubits on which quantum computer-enabled receivers could be based.
Step 1: $g_1 = g_3 = 0$. Load all oscillators with their steady states, i.e., coherent states with amplitude $B_j$ from Eq. (8). This can be done by driving and waiting for the system to relax or, more practically, through resonant drives of the two cavities and parametric displacement of the mechanical oscillator [25]. During this process, the optical cavity is also driven by the received input pulse with unknown state |α⟩. The idealized state of the three DOF at the end of this step is expressed in terms of $\beta = \sqrt{\eta_{\mathrm{in}}}\,\alpha$ and $\varrho(\bar{m}, a)$, a thermal state with thermal occupation $\bar{m}$ displaced to amplitude a.
Step 2: Set $g_1 = g_1^{\max}$, $g_3 = 0$ and wait $\tau_1 = \pi/(2G_1^{\max})$, at which point the beam-splitter interaction swaps the state of the shifted oscillators.
Step 3: Set $g_1 = 0$, $g_3 = g_3^{\max}$ and wait $\tau_3 = \pi/(2G_3^{\max})$, to execute a second swap into the microwave cavity.
The idealized states after each step of the transfer protocol do not take into account the dissipative and heating processes acting on the three DOF during the transfer time $\tau_1 + \tau_3$. Taking these into account results in a final state in the microwave cavity that is a displaced thermal state, Eq. (12) in the main text. The loss and heating parameters of that state, $\eta_{\mathrm{tr}}$ and $\bar{n}_{\mathrm{tr}}$, are derived in Ref. [27]. While the loss parameter is easy to interpret, the heating parameter $\bar{n}_{\mathrm{tr}}$ (interpreted as the number of thermal photons added by the transduction process) has a complicated dependence on the physical parameters. To illustrate the behavior of $\bar{n}_{\mathrm{tr}}$, in Fig. 9 we plot it as a function of some of the physical parameters as they are varied from their fiducial values at T = 1 K. The overall variation is similar for the lower temperature of T = 1 mK. As seen from these plots, the variation of $\bar{n}_{\mathrm{tr}}$ with the physical parameters is mild.
Table I in the main text shows the values of $\bar{n}_{\mathrm{tr}}$ and $\eta_{\mathrm{tr}}$ at the fiducial transduction parameter values given in Appendix A and at the two possible operating temperatures.
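The two swap steps of the transfer protocol can be illustrated with a minimal, lossless sketch. It assumes a beam-splitter Hamiltonian of the form $H = G(a^\dagger b + a b^\dagger)$ between neighboring modes, works in the displaced frame where the steady-state amplitudes are subtracted, and ignores thermal occupation and dissipation entirely; the coupling values and function names are hypothetical. Under this assumed Hamiltonian the coherent amplitudes obey $a(t) = a\cos(Gt) - i b\sin(Gt)$, so waiting $\pi/(2G)$ gives a full swap up to a phase.

```python
import numpy as np


def beam_splitter(amp_a, amp_b, coupling, duration):
    """Evolve two coherent amplitudes under H = G (a^dag b + a b^dag)."""
    c = np.cos(coupling * duration)
    s = np.sin(coupling * duration)
    return c * amp_a - 1j * s * amp_b, c * amp_b - 1j * s * amp_a


def transfer(beta, g1_max, g3_max):
    """Idealized optical -> mechanical -> microwave amplitude transfer."""
    optical, mechanical, microwave = complex(beta), 0j, 0j
    # Step 2: optical-mechanical swap for tau_1 = pi / (2 G_1^max).
    tau1 = np.pi / (2.0 * g1_max)
    optical, mechanical = beam_splitter(optical, mechanical, g1_max, tau1)
    # Step 3: mechanical-microwave swap for tau_3 = pi / (2 G_3^max).
    tau3 = np.pi / (2.0 * g3_max)
    mechanical, microwave = beam_splitter(mechanical, microwave, g3_max, tau3)
    return optical, mechanical, microwave


if __name__ == "__main__":
    # Hypothetical coupling rates in arbitrary units.
    print(transfer(beta=0.3 + 0.1j, g1_max=2.0, g3_max=5.0))
```

In this idealization the full amplitude β ends up in the microwave mode with an overall sign flip; the loss and heating captured by $\eta_{\mathrm{tr}}$ and the transduction heating parameter are exactly what this sketch leaves out.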

Appendix C: Tuning the Jaynes-Cummings interaction
In this appendix we present details of the Jaynes-Cummings (JC) interaction dynamics that form the basis of the transduction from microwave cavity state to qubit state.
Using $b_3(0)|\alpha\rangle = \alpha|\alpha\rangle$ and $\sigma_-(0)|g\rangle = 0$, the off-diagonal expectation value can be evaluated to a simplified expression in which the function $f_n$ is a summation involving its arguments whose exact form is not important for our analysis below, except for the fact that it satisfies $f_n \geq 0$, ∀n. Note that the only dependence on arg(β) in this expression is from the β prefactor, and hence we can say that the qubit phase in the X − Y plane is a simple function of the phase of the coherent state, i.e., $\arg\langle\sigma_+(t)\rangle = i \arg(\beta^*)$.
The distinguishability of the transduced qubit states is evaluated by computing their trace distance. By simplifying the expression for $\langle\sigma_+(t)\sigma_-(t)\rangle_\pm$ using the above observations we can see that this quantity does not depend on arg(β), and therefore the trace distance between the two qubit states transduced from BPSK signals takes the form given in Eq. (C5). Examining this quantity, and approximating the arguments of the two trigonometric functions as the same, it is clear that a strategy for maximizing it is to choose $t \sim \pi/(4\chi\sqrt{N})$, where N is roughly the number of photons in the state $\varrho(\bar{n}_{\mathrm{tr}}, \pm\sqrt{\eta_{\mathrm{tr}}}\,\beta)$. Despite this guide, the optimal time unfortunately does not have a simple expression. However, we can evaluate it numerically, and when this is done we find that it has a complex dependence on the parameters, including a dependence on the mean photon number in the microwave cavity, $\mathrm{MPN} \equiv |\sqrt{\eta_{\mathrm{tr}}}\,\beta|^2 + \bar{n}_{\mathrm{tr}}$.
If we focus on the short-time regime, 0 ≤ t ≤ 5/χ, then the optimal time takes the form of Eq. (C6) for values of MPN ≥ 1/4. This form follows that of the optimal time identified by the analytical arguments above. However, for values of 0 < MPN ≤ 1/6, we find that the optimal time takes the simpler form of Eq. (C7). In this regime, the fact that the two trigonometric functions in Eq. (C5) have different n dependence ($\sqrt{n}$ versus $\sqrt{n+1}$) means that the above argument for maximizing τ is not valid: the n = 0 term dominates the sum and we should simply maximize sin(χt). In the intermediate regime, 1/6 < MPN < 1/4, the optimal time crosses over from Eq. (C7) to Eq. (C6), and this crossover depends on the balance between the coherent photons ($|\sqrt{\eta_{\mathrm{tr}}}\,\beta|^2$) and thermal photons ($\bar{n}_{\mathrm{tr}}$) in the state.
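The numerical search for the optimal interaction time can be sketched as follows. This is a simplified illustration and not the calculation used in the main text: it assumes a resonant Jaynes-Cummings Hamiltonian $H = \chi(b\,\sigma_+ + b^\dagger\sigma_-)$, a pure coherent state in the microwave cavity (no thermal photons and no loss), a qubit initialized in |g⟩, and a truncated Fock space. The function name and default arguments are hypothetical.

```python
import math

import numpy as np


def bpsk_trace_distance(beta, chi=1.0, t_max=5.0, n_times=501, dim=30):
    """Trace distance between qubit states transduced from |+beta>, |-beta>.

    Returns the time grid (0 <= t <= t_max / chi) and the trace distance
    at each time, for a cavity truncated to `dim` Fock levels.
    """
    b = np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)
    sigma_minus = np.array([[0, 1], [0, 0]], dtype=complex)  # |g><e|
    ham = chi * (np.kron(b, sigma_minus.conj().T) + np.kron(b.conj().T, sigma_minus))
    energies, modes = np.linalg.eigh(ham)
    times = np.linspace(0.0, t_max / chi, n_times)

    def qubit_states(amplitude):
        # Coherent-state Fock coefficients, qubit in |g> (index 0).
        coeffs = np.array(
            [amplitude**k / math.sqrt(math.factorial(k)) for k in range(dim)],
            dtype=complex,
        ) * math.exp(-abs(amplitude) ** 2 / 2)
        overlaps = modes.conj().T @ np.kron(coeffs, np.array([1, 0], dtype=complex))
        rhos = []
        for t in times:
            psi = (modes @ (np.exp(-1j * energies * t) * overlaps)).reshape(dim, 2)
            rhos.append(psi.T @ psi.conj())  # trace out the cavity
        return rhos

    plus, minus = qubit_states(beta), qubit_states(-beta)
    dist = np.array(
        [0.5 * np.abs(np.linalg.eigvalsh(p - m)).sum() for p, m in zip(plus, minus)]
    )
    return times, dist


if __name__ == "__main__":
    times, dist = bpsk_trace_distance(beta=math.sqrt(0.1))
    print("optimal chi * t in window:", times[np.argmax(dist)])
```

For a small mean photon number such as the 0.1 used here, the trace distance at χt = π/2, where sin(χt) peaks, is expected to be very close to the maximum over the window, in line with the low-MPN argument above; thermal photons would have to be added to the initial cavity state to reproduce the full model.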
Finally, we mention that if we go beyond the short-time regime, it is possible in some regimes of MPN to obtain the maximum trace distance at t ≈ 8/χ [41]. We generally do not consider such long-time dynamics in this work, though, since (i) the gain in trace distance at long times is only slight, and (ii) the short-time regime is most relevant to our application, since we want to transduce the information to the qubits as quickly as possible to increase bandwidth and also to minimize the impact of decoherence processes. In almost all of the numerical calculations of the JC interaction and qubit state used in the main text, we optimize the interaction time in the range 0 ≤ t ≤ 5/χ. The exception is the capacity plot of Fig. 7, where we optimize over a longer time window, 0 ≤ t ≤ 10/χ, to capture the true optimum qubit states.
We conclude this section with a comment on the initial state for the transduction dynamics. We have assumed that the initial state of the qubit is |g⟩ throughout. This is a natural initial state to consider because it is the ground state of the qubit. Moreover, there are reasons to believe this initial state is optimal for transducing phase information from a mode using the JC interaction. For example, if the qubit is initialized in a superposition, the populations will depend on the phase relation between the optical phase and the initial qubit phase, and in general the transduced BPSK states have no symmetry on the Bloch sphere. Therefore the distinguishability of the transduced states is not solely determined by properties of the microwave mode. We have also numerically verified that the distinguishability is not increased by choosing a different initial qubit state.
Appendix D: Receiver performance under simulated circuit error model
In this appendix, we study the performance of the joint receiver's variational quantum circuit under a simple theoretical noise model. In Fig. 10, we plot the probability of error in decoding under varying levels of noise for the case of four 3-qubit codewords transduced at T = 1 mK with the 3-layer variational circuit used in the main text. The noise model consists of (1) a single-qubit depolarizing channel after every single-qubit gate with error probability $p_1$, (2) a two-qubit depolarizing channel after every CNOT gate with error probability $p_2$, and (3) a measurement error probability of $p_m$.
As expected, the performance degrades as a function of increasing noise, but the quantum advantage persists for intermediate RMPNs even at appreciable levels of depolarizing noise (e.g., 0.1% error on single-qubit gates, 1% error on CNOT gates and measurements). This is largely a consequence of the short-depth variational circuit required to perform codeword discrimination.
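The ingredients of such a noise model can be sketched on density matrices and outcome distributions as follows. The conventions here are assumptions made for illustration and may differ from those used to produce Fig. 10: the depolarizing channel is taken as $\rho \to (1-p)\rho + p\,\mathbb{1}/d$, and the measurement error is modeled as an independent, symmetric bit flip on each measured qubit. Function names are hypothetical.

```python
import numpy as np


def depolarize(rho, p):
    """Depolarizing channel rho -> (1 - p) rho + p * Tr(rho) * I / d.

    Works for one qubit (d = 2, error probability p_1) or for two
    qubits (d = 4, error probability p_2) acting on the full register.
    """
    dim = rho.shape[0]
    return (1.0 - p) * rho + p * np.trace(rho) * np.eye(dim) / dim


def apply_readout_error(probs, p_m):
    """Flip each measured bit independently with probability p_m."""
    n_bits = len(probs).bit_length() - 1
    flip = np.array([[1.0 - p_m, p_m], [p_m, 1.0 - p_m]])
    confusion = np.ones((1, 1))
    for _ in range(n_bits):
        confusion = np.kron(confusion, flip)
    return confusion @ np.asarray(probs, dtype=float)


if __name__ == "__main__":
    rho = np.array([[1, 0], [0, 0]], dtype=complex)  # qubit in |0><0|
    print(depolarize(rho, 0.001).real)
    print(apply_readout_error([1.0, 0.0, 0.0, 0.0], 0.01))
```

In a full simulation, `depolarize` would be applied after each gate to the qubits that gate acts on, and `apply_readout_error` once to the final outcome distribution of the measured qubits.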

FIG. 2. Qubit states resulting from transduction of BPSK states for two values of received mean photon number, with transduction under the fiducial model parameters in Appendix A. The blue arrows correspond to low temperature (T = 1 mK, $\bar{n}_{\mathrm{tr}} = 0.001$) and the red arrows to high temperature (T = 1 K, $\bar{n}_{\mathrm{tr}} = 1.8$) transduction. The dashed red arrows extrapolate from the solid red arrows to the surface of the Bloch sphere and are a visual aid to show the reduction in Bloch vector length.

FIG. 3. Four-qubit example of the variational circuit ansatz used in Section III. Alternating layers of CNOT gates between neighboring qubits (assuming cyclic boundary conditions) conjugated by arbitrary single-qubit gates are repeated. For four qubits, two time steps are required to execute CNOTs between all neighbors. Therefore, the example above contains two layers of the ansatz. At the end of the circuit $\lceil \log_2(M) \rceil$ qubits (where M is the number of signaling states being distinguished) are measured and the bit string outcomes are assigned to each of the signaling states. The variational parameters are in the single-qubit gates, each one having three angle parameters; e.g., for the example shown here, there are 60 variational parameters.

FIG. 4. Average probability of error (1 − J) for decoding $M = 2^{n-1}$ n-qubit codewords using optomechanical transduction and trained variational circuits for n = 3, 4. (a) shows results for transduction at low temperature (1 mK), and (b) shows results for transduction at high temperature (1 K). In all plots, the black curves are the relevant Helstrom limits that capture the best pulse-by-pulse receiver performance, as explained in the main text. The dots show the performance of the quantum computer-based JDR for varying circuit depths; L is the number of layers of the variational ansatz in Fig. 3 that are optimized. The "Optimized U" dots correspond to the (1 − J) achievable by optimized n-qubit unitaries.

FIG. 5. Average probability of error (1 − J) for decoding $M = 2^{n-1}$ n-qubit codewords using optomechanical transduction and trained variational circuits for n = 5 to 8. This figure only shows results for low temperature transduction, and the (1 − J) achieved by ideal, optimized n-qubit unitaries. The solid curves are the relevant Helstrom limits that capture the best pulse-by-pulse receiver performance, as explained in the main text.

FIG. 6. Average probability of error (1 − J) for decoding M = 4 n-qubit codewords using optomechanical transduction and trained variational circuits for n = 4 to 10. This figure only shows results for low temperature transduction, and the (1 − J) achieved by ideal, optimized n-qubit unitaries. The solid curve is the Helstrom limit that captures the best pulse-by-pulse receiver performance when decoding two pulses (transmitting 2 bits). The inset shows the trend of exponentially decreasing average probability of error with increasing codeword size, n, at various values of RMPN.

FIG. 7. BPSK channel capacities per pulse as functions of RMPN. For comparison, we show our JDR performance both with ideal transduction ($\bar{n}_{\mathrm{tr}} = 0$, $\eta_{\mathrm{tr}} = 1$, red dashed curve) and in the low noise regime (T = 1 mK, blue dotted curve), as well as the capacity of the receiver proposed by Delaney et al. [15]. Note that both the ideal and the low-noise JDRs outperform the $C_1$ capacity until the RMPN reaches ∼0.8, a slightly larger range than for the Delaney et al. receiver.

FIG. 8. Average probability of error (1 − J) for decoding M = 4 three-qubit codewords with an optimized variational circuit implemented on ibm_algiers (× markers). L is the number of layers of the variational ansatz in Fig. 3 that are optimized. The codeword input states were initialized using the model of optomechanical transduction at T = 1 mK developed above. The black curve is the relevant Helstrom limit, and the dots are the theoretical, error-free error probabilities for the corresponding circuits. The inset displays a zoomed-in view of the region of the plot with the largest quantum advantage and shows the data points with error bars.

FIG. 9. Variation of $\bar{n}_{\mathrm{tr}}$ as physical parameters in the transduction model are varied from their fiducial values given in Appendix A. All parameters except the two varied in each plot are held at their fiducial values. The temperature is assumed to be T = 1 K, except for the cases where n is varied.

TABLE I. Transduction heating and loss parameters for fiducial transduction model parameters in Appendix A at two operating temperatures.

TABLE II. ibm_algiers device parameters for the qubits used to generate data in Fig. 8.