The Roles of Kerr nonlinearity in a Bosonic Quantum Neural Network

The emerging technology of quantum neural networks (QNNs) attracts great attention from both machine learning and quantum physics, owing to its potential to gain quantum advantage from an artificial neural network (ANN) system. Compared with their classical counterparts, QNNs have been shown to speed up information processing, enhance prediction or classification efficiency, and offer versatile and experimentally friendly platforms. It is well established that nonlinearity is an indispensable element in a classical ANN, while, in a QNN, the roles of Kerr nonlinearity are not yet fully understood. In this work, we consider a bosonic QNN and investigate both classical (simulating an XOR gate) and quantum (generating Schrödinger cat states) tasks to demonstrate that the Kerr nonlinearity not only enables non-trivial tasks but also makes the system more robust to errors.

Introduction. Biologically inspired artificial neural networks (ANNs) have shown great accomplishments in processing information, with the ability to break through the von Neumann bottleneck (the delay between processor and memory) [1][2][3][4]. Various types of ANN architectures have been proposed in the field of machine learning, for example, feedforward neural networks [5] and recurrent neural networks [3][6][7][8][9], which have been shown to be powerful in speech/pattern/fingerprint classification [10][11][12][13], financial forecasting [14] and nonlinear series prediction [15]. One normally seeks more complexity in an ANN to obtain richer dynamics, which in turn allows for better performance in the tasks mentioned above. This can be achieved, for example, by considering more layers/nodes [16], more complicated architectures [17,18] and stronger nonlinearity [19].
Based on these requirements of an ANN, many physical systems have been proposed as hardware implementation platforms for ANNs, for example: memristors [20], spintronics [21,22], microcavity exciton-polaritons [23][24][25][26], etc. In all these platforms, a nonlinear activation function is an indispensable part of even basic machine learning tasks. The role of the activation function is to determine the output of a specific node given a set of corresponding inputs. This means that if we allow only a linear activation function in an ANN, the outputs will simply be a linear transformation of the inputs (no matter how many network layers one implements). Such an ANN can be represented by a single matrix multiplication corresponding to a simple input-to-output process. More general transformations therefore require nonlinearity from the physical system used for hardware implementation.
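The collapse of stacked linear layers into a single matrix can be checked numerically. The following is a minimal sketch; the layer sizes and the random weights `W1`, `W2`, `W3` are arbitrary choices made for illustration and do not come from this work.

```python
import numpy as np

# Hypothetical layer sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 2))   # layer 1: 2 inputs -> 4 nodes
W2 = rng.normal(size=(3, 4))   # layer 2: 4 nodes  -> 3 nodes
W3 = rng.normal(size=(1, 3))   # layer 3: 3 nodes  -> 1 output

x = rng.normal(size=(2, 5))    # five random input vectors, one per column

# Three layers with a linear (identity) activation function.
deep = W3 @ (W2 @ (W1 @ x))

# The same map written as a single matrix multiplication.
W_eff = W3 @ W2 @ W1
shallow = W_eff @ x

collapses = np.allclose(deep, shallow)
print("deep linear network equals one matrix:", collapses)
```

Whatever the number of layers, the effective map `W_eff` stays a single linear transformation, which is why a nonlinear element is needed for tasks such as the XOR gate.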
Recently, quantum neural networks (QNNs) [27][28][29] have emerged as promising platforms combining the characteristics of ANNs and quantum physics. They aim to exploit the advantages of quantum mechanics, for example, more degrees of freedom from the large Hilbert space [30] and quantum correlations between quantum modes [31], in performing either classical or quantum tasks. In this direction, QNNs have been shown to offer speedup in solving classical tasks [32][33][34][35] and improvement in learning efficiency [36][37][38][39], compared to their classical counterparts. On the other hand, quantum tasks such as entanglement recognition [40], phase estimation [41], quantum state preparation [42,43] and quantum state tomography [44] have also been proposed with different kinds of QNNs.
Intuitively, one expects nonlinearity to be as important in a QNN as in a classical ANN, but this requires demonstration. In this work, we take as an example a bosonic QNN with different architectures and simple nearest neighbour hopping between the quantum network nodes. The architectures are kept simple, as the aim is to demonstrate how the Kerr nonlinearity plays a role in the QNN for both classical and quantum tasks. We first choose the classical task of simulating an XOR gate, for which nonlinearity in the input-to-output process is necessary. Note that here both the input and output signals are classical. We show that Kerr nonlinearity indeed allows the XOR gate to function, without nonlinear elements from anywhere else. Such additional nonlinearity can emerge simply from considering different quantities for the inputs and outputs, for example amplitude and intensity, which in turn obscures the role of the Kerr nonlinearity. Moreover, by considering a more realistic situation in which measurement error is unavoidable, we find that the Kerr nonlinearity has an error correction effect for classical tasks. In particular, we consider errors on the measured outputs and find that stronger Kerr nonlinearity reduces the errors for the XOR gate. Next, we extend this investigation to the quantum regime, where we consider a quantum operation that generates a corresponding Schrödinger cat state from an arbitrary coherent state. Here the outputs are quantum states. Since a Schrödinger cat state is usually characterized by its Wigner function, we choose a cost function that minimizes the difference between the Wigner functions of the obtained states and the target Schrödinger cat states. We show that Kerr nonlinearity plays an essential role in enabling this quantum operation and makes the QNN resistant to noise and errors.
The model. The considered quantum neural network (QNN) consists of n bosonic modes with random nearest neighbour coupling. In the Hamiltonian (Eq. 1), the onsite energy E and the strength of the Kerr nonlinearity α are taken to be the same for each bosonic mode, and $J_{ij}$ represents the nearest neighbour coupling strength between modes i and j. In our scheme, the QNN evolves following the quantum master equation (Eq. 2), where $\hat{H}_0$ is the Hamiltonian in Eq. 1 and $\mathcal{L}$ denotes the Lindblad operator. The second term on the right-hand side of Eq. 2 determines the decay of the modes in the QNN with a rate $\gamma/\hbar$. From the evolution of Eq. 2 we obtain the density matrix $\rho(\tau)$ and the occupation number $n_i = \mathrm{Tr}\{\hat{a}_i^\dagger \hat{a}_i \rho(\tau)\}$ of each mode at a certain time τ.
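For orientation, a Hamiltonian and master equation matching the description above could take the following form. This is a sketch under assumptions: the terms are inferred from the quantities named in the text (onsite energy E, Kerr strength α, couplings $J_{ij}$, decay rate $\gamma/\hbar$), while the prefactor conventions and the exact form of the Lindblad term are assumed standard choices and are not taken from the source.

```latex
% Assumed form of Eq. 1: onsite energy, Kerr term, nearest neighbour hopping.
\hat{H}_0 = \sum_{i} \left( E\,\hat{a}_i^\dagger \hat{a}_i
          + \alpha\,\hat{a}_i^\dagger \hat{a}_i^\dagger \hat{a}_i \hat{a}_i \right)
          + \sum_{\langle i,j \rangle} J_{ij} \left( \hat{a}_i^\dagger \hat{a}_j
          + \hat{a}_j^\dagger \hat{a}_i \right)

% Assumed form of Eq. 2: coherent evolution plus single-particle loss.
i\hbar \frac{\partial \rho}{\partial t} = [\hat{H}_0, \rho]
          + \frac{i\gamma}{2} \sum_{j} \mathcal{L}(\hat{a}_j)\,\rho

% Assumed Lindblad term (standard loss channel).
\mathcal{L}(\hat{a}_j)\,\rho = 2\,\hat{a}_j \rho\, \hat{a}_j^\dagger
          - \hat{a}_j^\dagger \hat{a}_j \rho - \rho\, \hat{a}_j^\dagger \hat{a}_j
```

With these conventions, and in the absence of coupling and pumping, the occupation of each mode obeys $\mathrm{d}n_j/\mathrm{d}t = -(\gamma/\hbar)\,n_j$, which is consistent with the decay rate $\gamma/\hbar$ quoted in the text.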
Classical Task. Here we consider the classical task of simulating a classical XOR gate with a QNN of four bosonic modes; the input and target output relation is shown in Fig. 1(b). The first two modes ($\hat{a}_{1,2}$) are the input modes and the last two modes ($\hat{a}_{3,4}$) are the output modes (see Fig. 1(a)). The input signals can be introduced as amplitudes of a coherent pump or encoded as the occupation numbers of the input modes. After evolution for a time τ, we measure the occupation numbers of the output modes. We first investigate the case of injecting the input signals via a coherent pump, in which case the Hamiltonian in Eq. 1 acquires a pumping term whose amplitudes correspond to the binary inputs in Fig. 1(b). This form of pumping is consistent with photonic neural network prototypes based on exciton-polaritons [25]. It turns out that without Kerr nonlinearity (the term $\alpha\hat{a}^\dagger\hat{a}^\dagger\hat{a}\hat{a}$), the QNN can already simulate an XOR gate, i.e., the trained outputs can match the target with negligible error, see the last two columns in the table of Fig. 1(b). This somewhat unexpected result can be explained by noting that the inputs and outputs in the QNN are nonlinearly related. That is, the inputs are encoded as pumping amplitudes while the outputs are occupation numbers. As the occupation number is a nonlinear function of the amplitude, the system is capable of a nonlinear input-to-output mapping corresponding to the XOR gate. Similar results are also demonstrated in Ref. [45]. Given that the Kerr nonlinearity is not required, we also verified that an evolution of classical states treated within a mean-field model of the system can reproduce the considered XOR gate (see the results in Fig. 2), which also rules out the importance of quantum correlations in this task. Details of simulating the classical neural networks can be found in the Supplementary Materials (SM).
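The statement that amplitude-encoded inputs and occupation-number outputs already provide a nonlinear map can be illustrated with a toy model. The sketch below is not the trained network of this work: it assumes a single output mode whose amplitude is a linear combination of the two pump amplitudes, with hypothetical weights `w = (1, -1)`, and takes the occupation as the squared modulus of that amplitude (as holds for a coherent state in a linear system).

```python
def toy_occupation(p1, p2, w=(1.0, -1.0)):
    """Occupation of a toy output mode.

    The amplitude is LINEAR in the pump amplitudes p1, p2 (no Kerr term);
    the only nonlinearity is the modulus squared taken at readout.
    The weights w are hypothetical, chosen to give destructive interference.
    """
    amplitude = w[0] * p1 + w[1] * p2
    return abs(amplitude) ** 2

# Binary inputs encoded as pump amplitudes.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [toy_occupation(p1, p2) for p1, p2 in inputs]

for (p1, p2), n in zip(inputs, outputs):
    print(f"P1={p1}, P2={p2} -> occupation {n}")
```

In this toy picture the XOR truth table arises from interference between the two pumped amplitudes followed by an intensity readout, without any Kerr term.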
Fig. 2. (a) Input-output correspondence when the intensity $|\psi_n|^2$ is measured as the output, for cases where the Kerr nonlinearity α is non-zero (Output1) or zero (Output2). (b) Input-output correspondence when the wavefunction amplitude $\psi_n$ is measured as the output, for cases where the Kerr nonlinearity α is non-zero (Output1) or zero (Output2).
Let us also note that if we were to instead consider particle intensities rather than pumping amplitudes as the inputs, the system would no longer be capable of simulating a nonlinear map and the XOR gate would not be possible. While the simple investigation so far suggests that Kerr nonlinearity is not necessary for QNNs and ANNs, as one might obtain nonlinear elements in the system by other means, we will show below that it does offer an essential element in situations accounting for measurement errors, noise, or more complicated tasks. Let us consider noise introduced on the measured occupation number of each mode. Instead of taking $n_i$ as the output, we consider $n_i + \delta_i$, where $\delta_i$ is the measurement error on the i-th mode. We consider a uniformly distributed random error $\delta_i \in [0, 0.8] \times n_i$, so that the errors are fractions of the corresponding occupation numbers. The error of the trained outputs is defined as the difference between the targets and the actual trained outputs. Note that since both the inputs and outputs of a logic gate are typically binary digits, the error must be below 0.5 for the gate to operate successfully.
Now, we encode the input information in the occupation numbers of modes $\hat{a}_1$ and $\hat{a}_2$, which means that instead of starting from the vacuum state for all inputs, we start with coherent states with average occupations 0/1 in modes 1 and 2 according to the different inputs in Fig. 1(b). In this case, no pumping scheme is considered in the system (set P = 0 in Eq. 2). In Fig. 1(c) (blue curve), the error drops rapidly to an insignificant level as the nonlinearity increases, whereas it is non-negligible (> 0.5) at zero Kerr nonlinearity. Now, we revisit the scheme where the inputs are encoded in the amplitudes of the coherent pump $P_{1,2}$. In this case, we start the system from vacuum states for all the modes. In Fig. 1(c) (red curve), the error is already insignificant even when α is set to zero. As mentioned above, this is because there is nonlinearity already present, i.e., that between the input and output quantities. One can still see that the error drops with increasing strength of the Kerr nonlinearity. Intensity-encoded input is also consistent with neural networks based on non-resonantly pumped polariton condensates [46], which are considered explicitly in the SM. We conclude that Kerr nonlinearity offers an essential element for QNNs to simulate nonlinear processes such as the XOR gate in the presence of measurement errors.
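The 0.5 threshold can be related to the limitation of a purely linear input-to-output map. As a minimal sketch, assume (for illustration only) that without Kerr nonlinearity the measured output is an affine function of the two intensity-encoded inputs; the best least-squares affine fit to the XOR table can then be computed directly.

```python
import numpy as np

# XOR truth table: inputs encoded as occupations 0/1, target output 0/1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

# Assumed affine model: output = a*x1 + b*x2 + c (no nonlinearity anywhere).
A = np.hstack([X, np.ones((4, 1))])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)

pred = A @ coeffs
max_error = float(np.max(np.abs(pred - y)))

print("fitted (a, b, c):", np.round(coeffs, 6))
print("predictions:", np.round(pred, 6))
print("largest deviation from target:", round(max_error, 6))
```

Under this assumption the best affine map returns 0.5 for every input, so the deviation from the binary target sits exactly at the 0.5 threshold and the gate cannot operate.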
Schrödinger cat state generating operation. Here, we consider a quantum task utilising the QNN, that is, generating a Schrödinger cat state from a given coherent state $|\beta\rangle$. The target Schrödinger cat state is defined in terms of the coherent state $|\beta\rangle = e^{\beta\hat{a}^\dagger - \beta^*\hat{a}}|0\rangle$, the coefficient $N_{\beta,k} = 2[1 + (-1)^k e^{-2\beta^2}]$ and the vacuum state $|0\rangle$.
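The coefficient $N_{\beta,k}$ can be checked numerically. The sketch below assumes that the target has the standard form $(|\beta\rangle + (-1)^k|-\beta\rangle)/\sqrt{N_{\beta,k}}$ with real β, which is an assumption consistent with the coefficient quoted above rather than a formula taken from the source. The Fock-space cutoff `dim` and the function names are illustrative choices.

```python
import numpy as np

def coherent_state(beta, dim=60):
    """Fock-basis coefficients of a coherent state with REAL amplitude beta."""
    c = np.zeros(dim)
    c[0] = np.exp(-beta**2 / 2)
    for n in range(1, dim):
        # Recurrence c_n = c_{n-1} * beta / sqrt(n) avoids large factorials.
        c[n] = c[n - 1] * beta / np.sqrt(n)
    return c

def cat_norm_numeric(beta, k, dim=60):
    """Squared norm of the unnormalised superposition |beta> + (-1)^k |-beta>."""
    psi = coherent_state(beta, dim) + (-1) ** k * coherent_state(-beta, dim)
    return float(psi @ psi)

def cat_norm_formula(beta, k):
    """Coefficient N_{beta,k} = 2[1 + (-1)^k exp(-2 beta^2)] quoted in the text."""
    return 2 * (1 + (-1) ** k * np.exp(-2 * beta**2))

for beta in (1.0, 1.4):
    for k in (0, 1):
        print(beta, k, cat_norm_numeric(beta, k), cat_norm_formula(beta, k))
```

The agreement follows from the coherent-state overlap $\langle\beta|-\beta\rangle = e^{-2\beta^2}$ for real β.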
In this task, we consider five bosonic modes in the QNN and start the system from a coherent state ($|\beta\rangle$) for mode 1 and vacuum states ($|0\rangle$) for the other modes, see the schematic in Fig. 2(a). Each mode is under a coherent pump and the system evolves following Eq. 2 until time τ. The reservoir modes can emit, e.g., photons, which can be recombined through linear mixing to form a set of new bosonic modes. The linear mixing is energy conserving and can be done with linear optics elements, i.e., beam splitters and phase shifters [47]. The modes after the linear mixing process can be represented by annihilation (creation) operators that are linear combinations of those of the QNN modes with weights $W_{ij}$; the weight matrix is unitary, as required by the commutation relations of the output modes [42]. For a QNN composed of N modes, one can construct at most N output modes with the linear mixing process. In what follows, we will focus on one output mode, while the other N − 1 are assumed to be in the vacuum state, which can be realized with conditional measurements. Conditional measurements are routinely used to generate Fock states in a strongly coupled oscillator-spin system [48], motional Fock states of an atom [49], superpositions of pure states [50][51][52], photon added states [53], etc. Conditional measurement has also been shown to improve the teleportation of continuous variables [54][55][56]. Related experiments based on beam splitters and two photon-number-resolving detectors [57] and silicon photomultipliers [58] have also been realized. We note that apart from linear optics, the linear mixing transformation can be obtained with simple tunable hopping interactions between bosonic modes. An exemplary experimental setup in this case has been demonstrated with two interacting microwave cavities [59].
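The link between unitarity of the weight matrix and the commutation relations of the output modes can be written out in one line. In the sketch below the symbol $\hat{c}_i$ for an output mode is hypothetical notation introduced only for this illustration; the QNN modes $\hat{a}_j$ are assumed to satisfy $[\hat{a}_j, \hat{a}_l^\dagger] = \delta_{jl}$.

```latex
% Hypothetical notation: \hat{c}_i denotes an output mode after linear mixing.
\hat{c}_i = \sum_j W_{ij}\,\hat{a}_j ,
\qquad
[\hat{c}_i, \hat{c}_k^\dagger]
  = \sum_{j,l} W_{ij} W_{kl}^{*}\,[\hat{a}_j, \hat{a}_l^\dagger]
  = \sum_{j} W_{ij} W_{kj}^{*}
  = (W W^\dagger)_{ik}
  = \delta_{ik}
```

The last equality holds if and only if $W W^\dagger = \mathbb{1}$, i.e., the output modes are independent bosonic modes exactly when the weight matrix is unitary.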
Since the task is to generate a Schrödinger cat state, we denote the Wigner function of the quantum state obtained from the QNN by $W_O(x, p)$ and that of the target Schrödinger cat state by $W_T(x, p)$, and define the error δ from the difference between the two. In this case, the incoherent loss in the QNN plays a role similar to noise in the classical task, i.e., a process that reduces the quality of the output. In our simulation, we first start with different specific coherent states, for example, with amplitudes β = 1.4, 1.3, 1.2, 1.1 and 1. The aim here is to generate the corresponding Schrödinger cat states from different initial states $|\beta\rangle$ by optimizing the connection weights from the QNN to the newly constructed modes. Fig. 2(c) presents a prepared Schrödinger cat state with an error δ = 0.07, while the conditional measurement probability is 0.028. One can compare it with Fig. 2(b) to see that it is well matched with the target state. In Fig. 3, we can also see that with increasing strength of the Kerr nonlinearity (α), the error decreases down to ∼ 0.03 for β = 1, which indicates that the Kerr nonlinearity helps the QNN function in a noisy environment. We also consider the limit in which the strength of the Kerr nonlinearity goes to infinity, in which case the bosonic modes behave effectively as fermions. Fig. 3 shows that with infinite Kerr nonlinearity, the QNN still offers a low enough error δ to generate the Schrödinger cat states. The conditional measurement probability for infinite Kerr nonlinearity (fermions) to achieve an error δ comparable to the results in Fig. 2(c) is 0.085. Additionally, we note that the error decreases with decreasing β, which indicates that Schrödinger cat states of smaller amplitude β are easier to obtain. In this process, we keep the nearest neighbour couplings {$J_{ij}$} random and fixed. Moreover, for a general assessment, we generate 10 random coherent states with β ∈ [1, 1.4] and use the same optimisation scheme as described before.
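A Wigner-function-based error of this kind can be sketched numerically. Two assumptions are made here and neither is taken from the source: the quadrature convention is ħ = 1 with $x = (\hat{a} + \hat{a}^\dagger)/\sqrt{2}$, so that a coherent state with real β is centred at $x_0 = \sqrt{2}\beta$, and the error is taken as a discretised integral of $|W_O - W_T|$, a hypothetical stand-in for the actual definition. The function names and grid are illustrative.

```python
import numpy as np

def cat_wigner(x, p, beta, k):
    """Wigner function of (|beta> + (-1)^k |-beta>)/sqrt(N) for real beta.

    Assumed convention: hbar = 1, coherent state centred at x0 = sqrt(2)*beta.
    """
    s = (-1) ** k
    norm = 2 * (1 + s * np.exp(-2 * beta**2))
    x0 = np.sqrt(2) * beta
    lobes = np.exp(-(x - x0) ** 2 - p**2) + np.exp(-(x + x0) ** 2 - p**2)
    fringes = 2 * s * np.exp(-x**2 - p**2) * np.cos(2 * x0 * p)
    return (lobes + fringes) / (np.pi * norm)

def wigner_error(w_out, w_target, dx, dp):
    """Hypothetical error measure: discretised integral of |W_O - W_T|."""
    return float(np.sum(np.abs(w_out - w_target)) * dx * dp)

# Phase-space grid (illustrative extent and resolution).
grid = np.linspace(-6, 6, 241)
dx = dp = grid[1] - grid[0]
X, P = np.meshgrid(grid, grid, indexing="ij")

w_even = cat_wigner(X, P, 1.4, 0)   # k = 0
w_odd = cat_wigner(X, P, 1.4, 1)    # k = 1

total = float(np.sum(w_even) * dx * dp)        # should integrate to about 1
self_error = wigner_error(w_even, w_even, dx, dp)
cross_error = wigner_error(w_even, w_odd, dx, dp)
print(total, self_error, cross_error)
```

In an optimisation loop, `w_out` would be replaced by the Wigner function of the state produced by the network, and the mixing weights would be adjusted to reduce the error.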
The average error under different strengths of nonlinearity is presented in Fig. 3(b), which shows a similar trend to that in Fig. 3(a). Our results show that in general, i.e., for different coherent states as input, the Kerr nonlinearity allows error-resistant production of Schrödinger cat states. We have also considered other quantum tasks, such as single photon state generation, and reached the same conclusion (see SM).
Conclusion. We demonstrated how the Kerr nonlinearity functions in quantum machine learning with a bosonic QNN with random nearest neighbour coupling, by considering both classical and quantum tasks. Starting with a classical task, we simulated the XOR gate. We showed that a nonlinear input-output mapping is able to perform nonlinear classical tasks, for example the XOR gate, even without the Kerr nonlinearity. In a practical environment, however, measurement error is unavoidable. We therefore introduced an error on the measured occupation numbers, and the Kerr nonlinearity was shown to be able to correct this error. For the quantum task, we constructed a quantum gate operating on coherent states and generating the corresponding Schrödinger cat states. The incoherent loss is interpreted as error in this system, and the results show that Kerr nonlinearity offers the capability to resist this error/noise. These results give a clear direction for how to construct and optimize a QNN for performing different tasks (in both the classical and quantum regimes).