Deep-learning-based quantum imaging using NOON states

The phase sensitivity of photonic NOON states scales as 1/N, which reaches the Heisenberg limit and indicates great potential for high-quality optical phase sensing. However, NOON states with large photon number N are experimentally difficult both to prepare and to operate, a fact that severely limits their practical use. In this article, we soften the requirements for high-quality imaging based on large-N NOON states by introducing deep-learning methods. Specifically, we show that, with the help of a deep-learning network, the fluctuation of the images obtained by N = 2 NOON states can be reduced to that of the currently infeasible imaging by N = 8 NOON states. We numerically investigate our results obtained by two types of deep-learning models, deep neural networks and convolutional denoising autoencoders, and characterize the imaging quality using the root mean square error. By comparison, we find that small-N NOON-state imaging data is sufficient for training the deep-learning models of our schemes, which supports their direct application to the imaging processes.


Introduction
One of the most important issues in modern optics is to find new methods that break the limitations of traditional imaging schemes. Over the past 60 years, with the development of modern nonlinear optics, imaging schemes based on the properties of quantum light have been proposed as possible solutions to this issue [1]. These quantum imaging schemes have been theoretically investigated and experimentally demonstrated, exhibiting sub-shot-noise performance [2][3][4], super-resolution [5][6][7][8][9][10], or the capability of extracting object information with undetected photons [11][12][13][14]. Among the candidate quantum states of light for imaging, photonic NOON states have attracted much attention. This is because the interference fringes of the NOON state exhibit a phase sensitivity scaling as 1/N (Heisenberg scaling) [15,16], which is an improvement over classical light interferometry and therefore facilitates applications to different types of quantum imaging [17][18][19][20].
Obviously, NOON states with large photon number N are more advantageous for high-quality imaging than states with small photon number. However, schemes involving large-N NOON states are by all means difficult to implement. Because the photons of a NOON state are highly correlated, observing its ultra-sensitivity requires identifying correlations whose complexity grows exponentially with the photon number. So far, only a few strategies utilizing the measurement of a small number of correlated photons have been realized for NOON-state imaging under laboratory conditions, including multi-photon absorption [18], optical centroid measurements [21,22], post-selection measurements [20], etc. As special treatments, they are hard to extend to the detection of NOON states with large photon number, e.g., when N = 8. Moreover, the fragility of quantum states is also fatal for NOON-state imaging. Fluctuation of the number of correlated photons immediately destroys the phase sensitivity, which is even harder to avoid for large-N NOON states.
In general, the application of quantum imaging by NOON states faces a dilemma: small-N NOON states are easier to generate and manipulate but less advantageous for imaging, while large-N NOON states provide a higher phase sensitivity for imaging but are hard to maintain and measure. Here, we present our solution to this dilemma by combining the ease of imaging with small-N NOON states and the good imaging quality enabled by large-N NOON states. Such a combination is realized by deep-learning methods. In recent years, the development of deep learning has received extensive attention. As a branch of machine learning, the key idea of deep learning is to establish a multi-layer neural-network model that describes the hidden relations behind the data samples of a concerned problem [21]. Although a model built from finite samples is actually an approximation of the exact solution to the problem, it is effective enough to address plenty of down-to-earth issues at a certain level of accuracy. For example, models of the relation between noisy images and their original data have proven to be a powerful tool for image denoising, and have already been applied to computer vision [22,23], traditional optical imaging [24][25][26][27][28][29], thermal-light ghost imaging [30], etc. In the task of NOON-state imaging, if a mapping from the imaging space of small-N NOON states to that of large-N NOON states can be found, one can naturally use the mapping to obtain large-N-NOON-state-level images from the raw data gathered by the small-N-NOON-state imaging process. In our work, we show that the demanded mapping can be modelled by the above multi-layer neural network.
Specifically, we consider two broadly used deep-learning models: one is the deep neural network (DNN), the other is the convolutional denoising autoencoder (CDAE). The data samples are obtained by simulating images of the handwritten digit figures in MNIST [31]. The figures are set to be polarization birefringence objects, which are suitable for imaging by NOON-state interferometry. The raw data is composed of the noisy images obtained by NOON-state imaging when N = 2. The training dataset is then set to be the raw data labelled by the corresponding original figures. After training the networks with this dataset, we find that the noise level of the images by N = 2 NOON states can be reduced to that of the images by N = 8 NOON states. Besides, our simulation also indicates that such an improvement can be implemented using only tests on the simplest NOON-state imaging process. This means the method could provide a rather friendly starting point for the daily application of NOON-state imaging. We also notice that a series of works on optical implementations of neural networks have been presented [32][33][34]. These works indicate that well-trained neural networks can provide new insights into designing optical elements for specific aims. This is different from previous optical design based on scalar diffraction theory in the continuum regime. Hinted by neural networks, the optical processing of information, such as the imaging process we consider here, can be arranged in a digitized way. At the required level of accuracy, each bit of information encoded by light could be operated on, which brings more alternatives for modulation. What we do here is also numerical support for this perspective.
The rest of this article is arranged as follows. Section 2 introduces the NOON-state imaging scheme in the presence of quantum noise. Section 3 discusses NOON-state imaging using the DNN. Section 4 discusses NOON-state imaging using the CDAE. Section 5 concludes.

The quantum noise in NOON state imaging
A photonic NOON state is generally a two-mode entangled state, with all N photons in one mode or the other. It can be expressed by

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|N\rangle_a|0\rangle_b + |0\rangle_a|N\rangle_b\right), \qquad (1)$$

where a and b represent two optical modes. In different detection schemes, they could be polarization states, frequencies, or propagation paths. Here, we adopt the quantum polarized light microscope imaging scheme, in which a and b are two orthogonal polarization states. Such a scheme can be implemented by the setup shown in figure 1, and a clear Heisenberg scaling can be observed [20]. Specifically, consider a multi-photon NOON state, given by

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|N, 0\rangle + |0, N\rangle\right), \qquad (2)$$

where $|n, N-n\rangle$ denotes the Fock state with n horizontally polarized photons and N − n vertically polarized photons, n being an integer ranging from 0 to N. Such a NOON state manages to detect birefringent samples. Without loss of generality, we consider the cases in which the phase shift caused by the sample is very small (from 0 to 1 rad). After illuminating the sample, the NOON state changes into $|\psi'\rangle$, expressed by

$$|\psi'\rangle = \frac{1}{\sqrt{2}}\left(e^{iN\phi(x,y)}|N, 0\rangle + |0, N\rangle\right), \qquad (3)$$

where $\phi(x, y)$ represents the phase shift induced by the object. The sample information contained in the phase factor can then be decoded by photon-number-resolving measurement. For example, if one sets the projector to be the polarized photon-number state $|m, N-m\rangle$, the coincidence probability changes with the phase as

$$P_{m,N-m}(\phi) = |\langle m, N-m|\psi'\rangle|^2, \qquad (4)$$

where m is also an integer ranging from 0 to N. Based on equation (4), the phase factor $\phi$ can be retrieved by inverting the experimental estimation of $P_{m,N-m}$. Therefore, the image of the sample can be obtained by repeating the procedure at each location of the given birefringent sample. Due to the fluctuation of the quantum state, the above imaging scheme is also affected by quantum noise. Such an effect can be quantified by the phase measurement uncertainty

$$\Delta\phi = \frac{\Delta P_{m,N-m}}{|\partial P_{m,N-m}/\partial \phi|}, \qquad (5)$$

where $\Delta P_{m,N-m}$ represents the uncertainty of the coincidence rate [16,35]. Usually, the experimental noise is Gaussian distributed. Therefore, we model the experimentally estimated phase as the true phase plus a zero-mean Gaussian deviation with standard deviation $\Delta\phi$.
It has been proved that $\Delta\phi \geq 1/N$ for the NOON states we consider here [16,36]. Unfortunately, the analyzer for the multi-photon state is far from ideal, and $P_{m,N-m}$ decreases fast as N increases. Therefore, the Heisenberg limit is harder to reach for large-N NOON states.
We focus on the measurement of $P_{N/2,N/2}$ (for even N). This is because $P_{N/2,N/2}$ is the largest among all the probabilities for a given N, indicating that it is the easiest to measure. To investigate the effect of the noise under different conditions, we consider three cases as examples, with photon number N = 2, 4, and 8 respectively.
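A minimal sketch of the noisy imaging simulation follows. The noise level used here is a hypothetical stand-in (a zero-mean Gaussian whose standard deviation follows the Heisenberg scaling with N); the actual parameters used in the paper come from equations (4) and (5) and are listed in table 1.

```python
import numpy as np

def simulate_noon_image(phase_map, n_photons, sigma0=0.3, seed=0):
    """Point-by-point NOON-state imaging of a birefringent sample.

    phase_map : 2-D array of true phase shifts (0 to 1 rad).
    The noise standard deviation sigma0 / n_photons is a hypothetical
    stand-in for the noise-model parameters of table 1.
    """
    rng = np.random.default_rng(seed)
    noisy = phase_map + rng.normal(0.0, sigma0 / n_photons, phase_map.shape)
    return np.clip(noisy, 0.0, 1.0)  # phases stay in the considered range
```

With this model, images simulated at N = 2, 4 and 8 reproduce the qualitative trend of figures 2(b)-(d): larger N gives weaker fluctuations.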
So far, NOON-state imaging with N = 8 has not been experimentally implemented. Based on equations (4) and (5), the corresponding parameters of the noise model are given in table 1. The birefringent samples we consider in the numerical simulation are generated using the famous MNIST handwritten digit database. Specifically, the gray-scale values of the pixels (ranging from 0 to 255) of one figure in the database are linearly mapped to the phase shifts (ranging from 0 to 1 rad) at the corresponding locations of a sample. Therefore, we can simulate a number of birefringent samples using the digit figures in MNIST. As discussed above, the images of those samples can be obtained by scanning them point by point with the optical NOON states. Based on the noise model given in table 1 and the imaging processes, the simulated measurement results of the samples are presented in figures 2(b)-(d). In order to benchmark the quality of the images, we introduce the root mean square error (RMSE) [29]. Suppose that the matrix representation of an image is denoted by E, whose entry $E_{i,j}$ gives the pixel value (gray scale, phase shift, or other). For two K × L image matrices $E_1$ and $E_2$, the RMSE is defined by

$$\mathrm{RMSE}(E_1, E_2) = \sqrt{\frac{1}{KL}\sum_{i=1}^{K}\sum_{j=1}^{L}\left(E_{1;i,j} - E_{2;i,j}\right)^2},$$

where $E_{1;i,j}$ and $E_{2;i,j}$ are the entries of $E_1$ and $E_2$ respectively. The scores of the simulated images are shown in the first three rows of table 2, which quantitatively show the blurriness of the images obtained under illumination by different NOON states. It can be seen from the results that when the photon number N increases, the noise level of the images clearly decreases, indicating an improvement in the imaging quality.
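Both the gray-scale-to-phase mapping and the RMSE benchmark are straightforward to express in code; a minimal sketch (the function names are ours):

```python
import numpy as np

def gray_to_phase(img_u8):
    """Linearly map 0-255 gray levels to phase shifts in [0, 1] rad."""
    return np.asarray(img_u8, dtype=float) / 255.0

def rmse(e1, e2):
    """Root mean square error between two K x L image matrices."""
    d = np.asarray(e1, dtype=float) - np.asarray(e2, dtype=float)
    return np.sqrt(np.mean(d ** 2))
```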
In order to bridge the gap between large-N and small-N NOON-state imaging, as mentioned above, we introduce deep-learning methods to model the mapping between the imaging data of the two schemes. Computing the mapping, which is equivalent to training the deep-learning model, requires a sufficiently large dataset. The training dataset we consider here is composed of the images of the birefringent samples obtained by simulating the N = 2 NOON-state imaging scheme. Each sample of the training dataset is labelled by its corresponding noise-free image. In a real experiment, obtaining such a dataset requires a series of imaging tests that can be performed with current techniques. The reason why we label the samples with noise-free images rather than with N = 8 NOON-state images is explained in the next section.

NOON state imaging based on DNN
The DNN method is a supervised deep-learning framework designed to generate a computational model using a large number of samples. The model is a multilayer network composed of linear, fully connected layers and nonlinear activation functions associated with each node (usually termed a neuron in the language of DNNs) of the layers. The layers of the network can generally be divided into an input layer, hidden layers and an output layer. When dealing with a specific task, the parameters of the network (weights and biases) are adjusted for an optimal output with minimal difference from the demanded output. More specifically, the function $G_p$ of the pth layer of the network can be defined by

$$G_p(u_{p-1}) = f(W_p u_{p-1} + B_p),$$

where $W_p$ is the linear transformation applied to the data vector $u_{p-1}$ input to the pth layer, $B_p$ is the bias vector, and f is the nonlinear function. The mapping $F_{\mathrm{DNN}}$ of an M-layer DNN can then be given by

$$v = F_{\mathrm{DNN}}(u) = G_M(G_{M-1}(\cdots G_1(u))),$$

where u, v are two data vectors. In our task, the training dataset is denoted by $\{(O_q, L_q)\}$, where $O_q$ ($L_q$) denotes the matrix of the image (label) of a sample obtained by simulating the NOON-state imaging when N = 2. Our task is to find an $F_{\mathrm{DNN}}$ such that the measure of the difference $Q(F_{\mathrm{DNN}}(O_q), L_q)$ reaches its minimum. The function Q is usually called the loss function and takes different forms in particular tasks. Instead of directly using the image set obtained by the N = 8 NOON state, here we consider the noise-free image dataset $L_q$. This is because such a dataset is easy to generate in the calibration test of the imaging process, when the sample information is given; it is also effective for our task. According to the DNN method, the optimization can be performed by back propagation, and the network parameters $W_p$ and $B_p$ can finally be obtained.
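The layer composition described above can be sketched in a few lines of NumPy; this illustrates the definitions of G_p and F_DNN, not the trained 784-neuron model itself:

```python
import numpy as np

def relu(x):
    """The nonlinear activation f used in the text."""
    return np.maximum(x, 0.0)

def dnn_forward(u, layers):
    """Evaluate v = F_DNN(u) = G_M(...G_1(u)) with G_p(u) = f(W_p u + B_p).

    `layers` is a list of (W_p, B_p) pairs, one per layer.
    """
    for W, b in layers:
        u = relu(W @ u + b)
    return u
```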
The optical mimicking of a DNN has been presented in [32], termed the all-optical diffractive DNN. The processing of the data by the DNN layers is mapped to the propagation of light diffracted by 3D-printed structures. This could be a foundation for DNN-based optical design. Here we focus on a DNN that could benefit NOON-state imaging. The specific DNN model we employ is shown in figure 3. It includes three hidden layers, an output layer and an adjustment layer for data resizing. The noisy images are preprocessed and transformed from a 28 × 28 matrix to a 1 × 784 column vector. All hidden layers and the output layer also have 784 neurons. The neurons in adjacent layers are connected by weighted links, and the output vector is reshaped to a 28 × 28 matrix for the convenience of image display. The nonlinear activation function of each layer is chosen to be the Rectified Linear Unit (ReLU) function, considering that the training of a DNN with ReLU is generally faster and more effective on large and complex datasets. The optimizer is Adam [37]. The parameter updating rate (learning rate) is set to 8 × 10⁻⁵, and the loss function is the mean square error (MSE). In order to prevent overfitting [38], forty percent of the links between neurons are randomly disconnected in each update step (dropout rate equal to 0.4). In the training process, we input the training dataset mentioned above, which contains sixty thousand samples, into the DNN model, and save the model parameters after each epoch of training. The training dataset is divided into several batches, and only one batch is used for an update of the parameters within an epoch. Each batch has 128 samples and the union of all batches is equal to the total training dataset. To monitor the training process, we also input ten thousand noisy images into the DNN model at each stage and calculate the average value of the loss function.
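The batching, dropout and MSE update described above can be sketched with a toy single-layer stand-in. This is only an illustration under stated assumptions: the dataset here is random, only one layer is shown, and plain SGD stands in for the Adam optimizer used in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy stand-ins for the 60 000-sample MNIST-based dataset.
n_samples, dim, batch_size = 512, 784, 128
X = rng.random((n_samples, dim))   # noisy N = 2 images, flattened to 784
Y = rng.random((n_samples, dim))   # noise-free label images

W = rng.normal(0.0, 0.01, (dim, dim))  # a single ReLU layer, for brevity
b = np.zeros(dim)
lr, dropout = 8e-5, 0.4                # learning rate and dropout from the text

for epoch in range(2):                 # the paper trains for 500 epochs
    for start in range(0, n_samples, batch_size):
        xb, yb = X[start:start + batch_size], Y[start:start + batch_size]
        # inverted dropout: randomly disconnect 40% of the input links
        mask = (rng.random(xb.shape) > dropout) / (1.0 - dropout)
        z = (xb * mask) @ W + b
        h = np.maximum(z, 0.0)                   # ReLU activation
        grad = 2.0 * (h - yb) / yb.size          # d(MSE)/d(output)
        grad[z <= 0] = 0.0                       # ReLU gradient
        W -= lr * (xb * mask).T @ grad           # SGD step (Adam in the paper)
        b -= lr * grad.sum(axis=0)
```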
The dependence of the loss function on the epoch number, for both the training dataset and the testing images, is shown in figure 4. The training terminates after five hundred epochs. We can see that the value of the MSE loss is finally small enough, which means the training process has converged. Therefore, the DNN model for NOON-state imaging is obtained.
The predicted results for noisy image examples are shown in figure 2(e). Compared with the images in figure 2(b), it can be seen that the trained DNN can effectively reduce the noise level of the images. More quantitatively, we input one thousand randomly chosen noisy images (involving all ten digits) into the trained DNN model and calculate the average RMSE for each digit. The results are shown in the fourth row of table 2. Combined with the data in the first to third rows of table 2, we see that, with the help of the DNN model, the noise level of all the digit images by N = 2 NOON states can be made even lower than that by N = 8 NOON states.

NOON state imaging based on CDAE
A CDAE is a denoising autoencoder equipped with convolution operations [27]. Traditionally, an autoencoder is a neural network trained by back propagation to learn an approximation to the identity mapping. A denoising autoencoder is an extension of the autoencoder [39]. To deal with stochastically corrupted input, the denoising autoencoder has to predict the original data from the corrupted values. Compared with the standard denoising autoencoder, the CDAE can take advantage of the convolution operation and has proven to be more suitable for imaging processes [28].
The main structure of a CDAE consists of two parts: encoding and decoding. Consider a matrix set $\{W_r\}$ representing the set of linear transformations for feature analysis in a CDAE. Then, the encoding of a given data sample vector u can be expressed by

$$h_r = s(u * W_r + B_r), \qquad (9)$$

where the feature map $h_r$ is the output of the encoding, $*$ denotes the convolution, $W_r$ denotes the rth matrix of the set and $B_r$ denotes the related bias vector [27,28]. s is generally a composite function, including the nonlinear activation and others. Following that, the decoding is given by

$$v = s'\left(\sum_r h_r * \tilde{W}_r + C\right), \qquad (10)$$

where $s'$ is also a composite function involving nonlinear activation but not necessarily equal to s, and C is a bias factor. Combining equations (9) and (10), one obtains a mapping $F_{\mathrm{CDAE}}$. As in the DNN case discussed above, our aim is to adjust the parameters $B_r$, $W_r$, $\tilde{W}_r$ and C so that the loss function $Q(F_{\mathrm{CDAE}}(O_q), L_q)$ reaches its minimum.
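The encoding and decoding steps of equations (9) and (10) can be sketched with a naive convolution; real CDAEs use optimized library routines, and the activations s and s' chosen here (ReLU and sigmoid, matching the layers described below) are one possible instance of the composite functions.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Naive 'same'-padded 2-D convolution (cross-correlation), enough
    to illustrate the * operation of equations (9) and (10)."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(image, pad)
    out = np.empty_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out

def cdae_encode(u, kernels, biases):
    """Feature maps h_r = s(u * W_r + B_r), with s taken as ReLU here."""
    return [np.maximum(conv2d_same(u, W) + B, 0.0)
            for W, B in zip(kernels, biases)]

def cdae_decode(feature_maps, kernels, c):
    """v = s'(sum_r h_r * W~_r + C), with s' taken as the sigmoid here."""
    total = sum(conv2d_same(h, W) for h, W in zip(feature_maps, kernels)) + c
    return 1.0 / (1.0 + np.exp(-total))
```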
Here we use a relatively simple CDAE, shown in figure 5. The encoding process consists of two convolutional layers and two pooling layers. The decoding process consists of three convolutional layers and two upsampling layers. The kernel of the convolution is set to be a 3 × 3 matrix. In the first encoding layer, we use thirty-two kernels for the convolution. If the size of the input data is 28 × 28, it changes to 28 × 28 × 32 after convolution, and to 14 × 14 × 32 after the MaxPooling operation. In the following layers, the convolution is done by 'high-dimensional kernels', usually called filters. For such 32-channel data, the corresponding filter is a 3 × 3 × 32 matrix, composed of thirty-two 3 × 3 kernels. In each of the second to fourth layers, we apply thirty-two filters to one convolution, activated by the ReLU function. The last convolutional layer uses only one 3 × 3 × 32 filter, activated by the sigmoid function. The optimizer is also Adam. The loss function is the binary cross-entropy (BCE) [40], and the batch size is set to 128 as well.
The training process is generally similar to that of the DNN. The training dataset and the test dataset for monitoring the procedure are the same as those used in the DNN training. The plot of the average loss function value for the training dataset and the testing images is shown in figure 6. The training terminates after two hundred epochs, by which point the convergence of the loss function has been observed. Therefore, the CDAE model for NOON-state imaging is obtained; its training time is shorter than that of the DNN.
The predicted results for noisy image examples are shown in figure 2(f). Compared with the images in figure 2(b), the reduction in noise level can also be seen. We also input the same number of randomly chosen noisy images into the trained CDAE model and calculate the average RMSE for a quantitative comparison. The results are shown in the fifth row of table 2. It can be seen that the CDAE model also suppresses the noise level of the images by N = 2 NOON states to that by N = 8 NOON states. The suppression is even better than that obtained by the DNN.
As far as we know, there is no direct implementation of a CDAE using an optical setup at present. The results we present above might be considered as motivation for research on such setups.

Figure 5. The model architecture of the CDAE. The size of the MaxPooling operation is 2 × 2 with a stride length of 2, meaning that a matrix operated on by MaxPooling shrinks to a half-sized matrix. Conversely, the UpSampling operation here doubles the size of the input matrix by outputting the tensor product of the input and a 2 × 2 all-ones matrix.

Figure 6. The performance of the CDAE with respect to epochs. The y-axis is the average value of the BCE. In this part, because the CDAE converges faster, we do two hundred epochs of training in total. The line-style marks train loss and val loss have the same meaning as in figure 4.
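The MaxPooling and UpSampling operations described in the caption of figure 5 can be sketched directly:

```python
import numpy as np

def max_pool_2x2(x):
    """2 x 2 MaxPooling with stride 2: halves each spatial dimension."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def up_sample_2x2(x):
    """UpSampling as the tensor (Kronecker) product of the input with a
    2 x 2 all-ones matrix, doubling each spatial dimension."""
    return np.kron(x, np.ones((2, 2)))
```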

Conclusion
In conclusion, we propose to solve the current dilemma of NOON-state imaging by introducing deep-learning methods. In particular, we consider two examples of deep-learning models, the DNN and the CDAE, and present our numerical analysis. We measure the noise level using the classic RMSE and, by comparison, find that both the DNN and the CDAE can effectively denoise the images obtained by the NOON-state imaging scheme. As shown in figure 2 and table 2, the noise level of the N = 2 case can be reduced to that of the N = 8 case. The results indicate that one can obtain images as good as those illuminated by large-N NOON states using only the simplest NOON state, which promotes the application of quantum imaging.
Besides, we would like to expand a little on our conclusion. Because the fluctuation of the images obtained under NOON-state illumination depends on the statistical properties of the photons in such states, our results give a clue that one might consider employing machine-learning models as mappings of the entire distribution of large-N NOON states to that of small-N NOON states. If so, a more convenient way to generate or measure large-N NOON states can be expected, whose cost would shrink to the level of generating or measuring small-N NOON states. Meanwhile, as briefly discussed above, recent progress in classical optics [32][33][34][41] has shown that optical neural networks exhibit great potential for light modulation. Based on Klyshko's advanced-wave picture [42], we believe that those optical neural networks are also workable for the manipulation of quantum light. Therefore, the above mappings based on machine-learning models are in principle realizable. We leave this subject to our future research.