Estimation of Zernike polynomials for a highly focused electromagnetic field using polarimetric mapping images and neural networks

In this communication, we present a method to estimate the aberrated wavefront at the focal plane of a vectorial diffraction system. Unlike the phase, the polarization state of an optical field can be measured directly. We therefore introduce an alternative approach for determining the wavefront aberration from polarimetric information. The method is based on training a convolutional neural network with a large set of polarimetric mapping images obtained by simulating the propagation of aberrated wavefronts through a high-NA microscope objective; the coefficients of the Zernike polynomials can then be recovered by querying the trained network. On the one hand, our approach aims to eliminate the need for phase retrieval in wavefront sensing applications, provided the beam used is known. On the other hand, the approach might be applied to calibrating complex optical systems that suffer from aberrations. As a proof of concept, we use a radially polarized Gaussian-like beam multiplied by a phase term that describes the wavefront aberration. The training dataset is produced using Zernike polynomials with random coefficients. Two thousand random combinations of polynomial coefficients are simulated. For each one, the Stokes parameters are calculated to form a polarimetric mapping image, which is the input of a neural network model designed and trained to predict the polynomial coefficients. The accuracy of the model is tested on an unseen dataset (test dataset), where it reaches 0.978.


Introduction
Highly focused beams and their properties have been investigated in many fields such as nonlinear optics, super-resolution microscopy, tomography, optical encryption, and optical tweezers [1][2][3][4][5]. Tightly focused beams attract much attention because of the non-negligible component of the electric field in the direction of propagation. Nevertheless, the direct visualization of the longitudinal component is challenging [6]. Several techniques have been proposed, but they are complex and/or designed for specific applications [7,8]; we recently proposed a relatively simple method for visualizing the longitudinal component of highly focused beams using conventional imaging systems [9]. The procedure is based on retrieving the transverse components using a phase retrieval algorithm. Like most phase retrieval algorithms, the method requires capturing the intensity patterns of the transverse components at two different planes within the depth of focus. The longitudinal component can then be retrieved by applying Gauss's theorem to the recovered transverse components. Since retrieving the longitudinal component depends directly on the quality of the retrieved transverse components, it is essential to consider any possible error in the optical setup. One of the main errors in a complex optical system is the aberration caused by optical elements and misalignment.
In [10], the authors proposed a practical method for reconstructing the wavefront of a focused beam from a measured diffraction pattern. Since the phase of optical waves cannot be recorded by intensity detectors (for instance, CCD cameras), they applied a phase-retrieval framework based on a neural network to solve the phase problem. In their approach, the neural network is trained with labeled training datasets obtained by simulating aberrations according to Zernike polynomials. On the one hand, the labeled objects (outputs of the neural network) are obtained by applying the Fourier transform to a Gaussian beam multiplied by random phases built from Zernike polynomials with random scalar coefficients. On the other hand, the neural network inputs are diffraction patterns obtained by simulating the propagation of the objects through a diffractive mask consisting of a thin, absorbing metal film on a silicon nitride membrane with holes.
In this communication, we present an alternative approach to estimating the Zernike polynomials at the focal plane of a highly focused electromagnetic field by means of the polarization of optical beams. Unlike the phase, the polarization state of optical waves can be measured directly using a linear polarizer followed by a quarter-wave plate and a CCD camera. To this end, a radial polarization is imposed on aberrated Gaussian-like beams; then, the focused fields are simulated through a high-NA microscope objective to calculate the Stokes images at the focal plane. We then feed the polarimetric mapping images (PMI) to a convolutional neural network in order to map them to the Zernike polynomial coefficients. Our approach has the advantage of recording the Stokes images at a single plane (the focal plane) and eliminating the need for phase retrieval, provided the beam is known. The method is partially based on the methodology we recently introduced in [11].
The remainder of this communication is organized as follows: Section 2 reviews the basic concepts used in this work, including the propagation of a highly focused electromagnetic field and Zernike polynomials. Section 3 describes the simulation procedure, including the optical setup, the training dataset, and the neural network model. Section 4 discusses the results and possible improvements. Finally, the last section is dedicated to conclusions.

Highly focused electromagnetic fields
Richards and Wolf proposed the vectorial diffraction integral for calculating the focused field of monochromatic light through an aplanatic lens [12]. Instead of direct integration, the vectorial diffraction integral can be written in terms of the Fourier transform [13], which gives the focal field E at the point (x, y, z), relative to the focal point (0, 0, 0), through equations (1) to (5). In these expressions, the transformed quantity is the electromagnetic field at the Gaussian sphere of reference (centered at the focal point), θ and φ are the spherical coordinates with the origin at the focal point, and x, y, and z are the Cartesian coordinates. Besides, FT[...] represents the two-dimensional Fourier transform, and λ and f are the illuminating wavelength and the focal length of the focusing system, respectively. The transverse and the longitudinal components of a radially polarized incident beam $E_0$ at the focal plane (z = 0) can therefore be obtained by means of equations (1) to (5). Moreover, the Stokes parameters can be obtained from intensity measurements in which α and δ stand for the orientation angle of a linear polarizer and the phase delay introduced by a retarder (δ = 90° for a quarter-wave plate), respectively.

Figure 1. Notation and coordinate systems at the entrance pupil, Gaussian sphere of reference (exit pupil), and the focal plane.
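The measurement of the Stokes parameters with a linear polarizer and a retarder can be illustrated with a short numerical sketch. It assumes the common textbook convention in which the retarder adds a delay δ to the y component before the polarizer at angle α, and in which the four parameters are built from the intensities I(0°, 0), I(90°, 0), I(45°, 0), I(135°, 0), I(45°, 90°), and I(135°, 90°). The function names are hypothetical, and the sign of S3 depends on the handedness convention, so this is not necessarily the exact form used by the authors.

```python
import numpy as np

def measured_intensity(ex, ey, alpha_deg, delta_deg):
    """Intensity recorded behind a retarder (delay delta on the y component)
    followed by a linear polarizer oriented at angle alpha (assumed model)."""
    alpha = np.deg2rad(alpha_deg)
    delta = np.deg2rad(delta_deg)
    field = ex * np.cos(alpha) + ey * np.exp(1j * delta) * np.sin(alpha)
    return np.abs(field) ** 2

def stokes_images(ex, ey):
    """Stokes parameters from six simulated intensity measurements."""
    def i(a, d):
        return measured_intensity(ex, ey, a, d)
    s0 = i(0, 0) + i(90, 0)      # total intensity
    s1 = i(0, 0) - i(90, 0)      # horizontal vs. vertical linear
    s2 = i(45, 0) - i(135, 0)    # +45 deg vs. -45 deg linear
    s3 = i(45, 90) - i(135, 90)  # circular content (sign is convention dependent)
    return s0, s1, s2, s3

# Usage: complex transverse field components sampled on a 50 x 50 grid.
rng = np.random.default_rng(0)
ex = rng.normal(size=(50, 50)) + 1j * rng.normal(size=(50, 50))
ey = rng.normal(size=(50, 50)) + 1j * rng.normal(size=(50, 50))
s0, s1, s2, s3 = stokes_images(ex, ey)
```

For a fully polarized field, such as a simulated one, the four images satisfy S0² = S1² + S2² + S3² pixel by pixel, which is a convenient sanity check.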

Zernike polynomials
Wavefront aberration functions can be expanded in Zernike polynomials, a set of polynomials that are orthogonal on a circular pupil and indexed by nonnegative integers corresponding to the degree of the polynomials. The even and odd Zernike polynomials are defined by

$$Z_n^{m}(\rho, \varphi) = R_n^m(\rho)\cos(m\varphi), \qquad Z_n^{-m}(\rho, \varphi) = R_n^m(\rho)\sin(m\varphi),$$

where $R_n^m$ are the radial polynomials, with the special value $R_n^m(1) = 1$.
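The definition above can be checked numerically with a minimal sketch that uses the standard closed-form sum for the radial polynomials. The function names are hypothetical, and the code assumes the unnormalized convention in which $R_n^m(1) = 1$.

```python
from math import cos, factorial, sin

def radial_polynomial(n, m, rho):
    """Radial Zernike polynomial R_n^m(rho); zero when n - m is odd."""
    m = abs(m)
    if m > n or (n - m) % 2:
        return 0.0
    total = 0.0
    for k in range((n - m) // 2 + 1):
        numerator = (-1) ** k * factorial(n - k)
        denominator = (factorial(k)
                       * factorial((n + m) // 2 - k)
                       * factorial((n - m) // 2 - k))
        total += numerator / denominator * rho ** (n - 2 * k)
    return total

def zernike(n, m, rho, phi):
    """Even polynomial (cosine) for m >= 0, odd polynomial (sine) for m < 0."""
    radial = radial_polynomial(n, m, rho)
    return radial * cos(m * phi) if m >= 0 else radial * sin(-m * phi)

# Usage: defocus (n = 2, m = 0) is 2*rho**2 - 1, so it equals -0.5 at rho = 0.5.
defocus_value = zernike(2, 0, 0.5, 0.0)
```

Indexing the polynomials by the pair (n, m) keeps the sketch independent of any single-index ordering, which the text does not specify.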

Simulation
In this paper, we consider a numerical framework that matches the experimental setup we used to estimate the longitudinal component of highly focused beams. Without loss of generality, our approach can be adapted to any complex optical system dealing with aberrations. The numerical calculations were carried out using Python 3.7.5 on a laptop with an i7-1165G7 CPU (2.8 GHz) and 16 GB of RAM.
In addition, the neural network was implemented in TensorFlow 2.1 on a GPU (NVIDIA GeForce M450).

Optical setup
The optical setup is shown in figure 2.

Training dataset
As explained in section 2, we calculate aberrated focused fields and, subsequently, the Stokes images and Stokes parameters corresponding to a Gaussian-like beam with random phases described by Zernike polynomials. The input beam used in equations (6) to (8) is a Gaussian-like beam in which $r^2 = x^2 + y^2$ and $w_0$ is the beam waist. Since we consider radial polarization, the singularity at the origin is removed by multiplying the beam by $r$. The phase imposed by aberrations is obtained as a combination of the first 15 Zernike polynomials with randomly selected coefficients. Furthermore, the size of the dataset is increased to 18000 by adding Gaussian noise to each PMI. Figure 3 shows the Stokes images obtained at the focal plane for the radially polarized Gaussian-like beam without aberrations. Figure 4 shows the Stokes images obtained for a radially polarized beam with aberrations, as included in the training dataset. Figure 5 shows the normalized Stokes parameters, which form a PMI, corresponding to the Stokes images shown in figures 3 and 4.
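The construction of the random aberration phases and the noise augmentation can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the coefficients are drawn uniformly from [0, 0.01], the phase is taken as a linear combination of 15 precomputed Zernike maps, and the growth from 2000 to 18000 samples is modeled as the original image plus eight noisy copies. The noise level `sigma` and all function names are hypothetical.

```python
import numpy as np

def random_coefficients(rng, n_cases=2000, n_terms=15, c_max=0.01):
    """One row of Zernike coefficients per simulated aberration."""
    return rng.uniform(0.0, c_max, size=(n_cases, n_terms))

def aberration_phase(coefficients, basis):
    """Linear combination sum_i c_i Z_i for a basis of shape (n_terms, H, W)."""
    return np.tensordot(coefficients, basis, axes=1)

def augment_with_noise(rng, pmis, labels, copies=8, sigma=0.01):
    """Keep each PMI and append `copies` noisy versions with the same label."""
    stacked = [pmis]
    for _ in range(copies):
        stacked.append(pmis + rng.normal(0.0, sigma, size=pmis.shape))
    return np.concatenate(stacked), np.tile(labels, copies + 1)

# Usage on a small stand-in dataset (20 PMIs instead of 2000).
rng = np.random.default_rng(0)
coeffs = random_coefficients(rng)
basis = rng.normal(size=(15, 50, 50))          # placeholder for Zernike maps
phase = aberration_phase(coeffs[0], basis)     # shape (50, 50)
small_pmis = rng.uniform(-1, 1, size=(20, 50, 50, 3))
small_labels = np.arange(20)
aug_pmis, aug_labels = augment_with_noise(rng, small_pmis, small_labels)
```

With the full set of 2000 PMIs, the same call would yield 2000 × 9 = 18000 samples, matching the dataset size quoted in the text.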
Finally, the total dataset is split randomly into two sets: 80% for the training dataset (14400 PMIs) and 20% for the test dataset (3600 PMIs) [14]. The model is compiled with root-mean-square propagation (RMSprop) as the optimizer, the cross-entropy as the loss function, and the accuracy as the performance metric. To avoid overfitting, we monitored the accuracy and the loss after each epoch on a validation dataset, a randomly selected 20% of the training dataset. Note that the validation dataset differs from the test dataset mentioned previously. The training process was stopped after 50 epochs, when the training loss was still decreasing while the validation loss began to increase. To sum up, the input of the model is a PMI with a size of 50×50×3, and the output is the predicted set of Zernike polynomial coefficients, which are stored beforehand for each class label. The accuracy and the loss obtained for the classification are 0.989 and 0.033, respectively. The accuracy obtained on the test dataset is 0.978, which shows that the model generalizes well to unseen data.
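The two-level split described above can be made concrete with a small index-based sketch. The helper name and the fixed seeds are assumptions made for illustration; the text does not state which tool was used for the split.

```python
import numpy as np

def split_indices(n_samples, holdout_fraction=0.2, seed=0):
    """Randomly partition range(n_samples) into (kept, held-out) index arrays."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_holdout = int(round(n_samples * holdout_fraction))
    return order[n_holdout:], order[:n_holdout]

# 18000 PMIs -> 14400 for training and 3600 for the final test.
train_idx, test_idx = split_indices(18000, seed=0)

# 20% of the training part is set aside for validation during training.
fit_pos, val_pos = split_indices(len(train_idx), seed=1)
fit_idx, val_idx = train_idx[fit_pos], train_idx[val_pos]
```

Keeping the test indices separate from the validation indices mirrors the distinction drawn in the text: the validation set guides when to stop training, while the test set is only used once for the final accuracy.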

Discussions
We demonstrated that, by means of PMIs, the neural network model could successfully classify 2000 realizations of the aberration phase and predict an unseen test dataset. The training dataset can be easily enriched by adding more realizations. The first 15 aberrations were considered: oblique astigmatism, horizontal and vertical coma, primary spherical, defocus (longitudinal position), tilt in the x- and y-directions, etc. This set of aberrations can be modified according to the experimental setup used for a specific application. As a result, the training dataset can be tailored in practice to the type of application.
Although we considered a uniform range ([0, 0.01]) for the Zernike polynomial coefficients ($c_i$), in practice this range might be chosen non-uniformly, depending on the experimental setup used. We arbitrarily synthesized a radially polarized Gaussian-like beam in order to search for the coefficients of the Zernike polynomials. However, other properly polarized wave functions can be applied just as easily. The selected polarization state of the incident beam should provide different intensity patterns in the Stokes images. Therefore, any polarization state that fulfills this condition can be applied, for instance, spiral polarization or spatially variant polarization. Our approach aims at the instant detection of aberrations in highly focused electromagnetic fields. In particular, its application might be extended to the instant detection of the aberrated longitudinal component of highly focused fields without the need to fully retrieve the transverse components.

Conclusions
In this communication, we presented an approach to the instant detection of the Zernike polynomials at the focal plane of a highly focusing system, based on neural networks and polarimetric information. We imposed a radial polarization on aberrated Gaussian-like beams to obtain different intensity patterns of the Stokes parameters at the focal plane. The normalized Stokes parameters were arranged as an image with three channels (PMI). PMIs are the inputs of the neural network model designed and trained to map the Zernike polynomial coefficients from the corresponding polarimetric information. Our approach provides an alternative method to reconstruct aberrated wavefronts, provided the beam is known. This approach might also be comparable with approaches that require phase retrieval algorithms to solve the phase problem. Besides, the proposed method allows the adjustment of complex systems by synthesizing a properly polarized beam at the entrance pupil of a focusing system.
spatial frequencies in the Cartesian coordinates. In this work, the electric field is limited to an area where evanescent fields are not included. The incident field is a radially polarized beam of the form $E_0(\cos\varphi, \sin\varphi, 0)$. The unit vectors are defined along the radial and azimuthal directions, and $\mathbf{e}_2$ is the projection of $\mathbf{e}_2^i$ on the convergent wavefront surface. Figure 1 sketches the mentioned details.

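where n and m are nonnegative integers with the condition $n \ge m \ge 0$, $\varphi$ and $\rho$ are the azimuthal angle and the radial distance, respectively, and $R_n^m$ are the radial polynomials given by

$$R_n^m(\rho) = \sum_{k=0}^{(n-m)/2} \frac{(-1)^k\,(n-k)!}{k!\,\left(\frac{n+m}{2}-k\right)!\,\left(\frac{n-m}{2}-k\right)!}\,\rho^{\,n-2k}$$

for $n - m$ even, and $R_n^m(\rho) = 0$ for $n - m$ odd.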

The optical setup consists of three main parts. The first part generates a radially polarized Gaussian-like beam and images it at the entrance pupil of microscope objective MO1 (NA = 0.75). The second part provides vectorial diffraction by means of MO1. The third part images the focused field at the sensor plane of the CCD camera by means of MO2 (NA = 0.8), which is mounted on a variable stage with a positioning precision of ±100 nm. The linear polarizer and the quarter-wave plate placed in front of the CCD camera are used for obtaining the Stokes images. A spatial sampling of 37.5 nm and a magnification of 100 are obtained by imaging a 1951 USAF resolution test target, placed in front of MO2, at the sensor plane of the CCD. These details are used for simulating the focused fields.

Figure 2. Optical setup. LP, VR, MO, QWP, and CCD stand for a linear polarizer, a vortex retarder, a microscope objective, a quarter-wave plate, and a charge-coupled device, respectively. Figure adapted from [9] under a Creative Commons BY 4.0 license.
Each coefficient $c_i$ is randomly selected from the range 0 to 0.01. We consider 2000 random realizations of the aberration phase. For each realization, we calculate the corresponding Stokes parameters at the focal plane to form an image named a polarimetric mapping image (PMI). A PMI consists of three channels: $S_1/S_0$, $S_2/S_0$, and $S_3/S_0$. These PMIs are the inputs used to train the CNN model. In total, we calculate 2000 PMIs corresponding to the 2000 random realizations.
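Forming a PMI from the four Stokes images amounts to a per-pixel normalization followed by stacking. The sketch below is one possible way to do this; the function name and the threshold `eps`, used to avoid dividing by zero in dark pixels, are assumptions and are not taken from the text.

```python
import numpy as np

def polarimetric_mapping_image(s0, s1, s2, s3, eps=1e-12):
    """Stack S1/S0, S2/S0, and S3/S0 into an image of shape (H, W, 3)."""
    bright = s0 > eps
    safe_s0 = np.where(bright, s0, 1.0)  # dummy divisor where S0 is ~0
    channels = [np.where(bright, s / safe_s0, 0.0) for s in (s1, s2, s3)]
    return np.stack(channels, axis=-1)

# Usage with synthetic Stokes images on a 50 x 50 grid.
s0 = np.full((50, 50), 2.0)
s1 = np.full((50, 50), 1.0)
s2 = np.zeros((50, 50))
s3 = np.full((50, 50), -2.0)
s0[0, 0] = 0.0                       # a dark pixel
pmi = polarimetric_mapping_image(s0, s1, s2, s3)
```

Because each channel is a ratio to S0, its values lie in [-1, 1] for physically valid Stokes data, which gives the network inputs a fixed scale regardless of the beam power.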

Figure 3. Stokes images obtained at the focal plane for the radially polarized Gaussian-like beam without aberrations.

CNN model
The neural network used in this work is a sequential model including three convolutional layers with 32, 64, and 128 filters and a kernel size of 3, each with the hyperbolic tangent activation function. Batch normalization and average pooling layers are applied after each convolutional layer. Then, the resulting feature map is flattened into a one-dimensional array. This array is connected to a dense layer of 3000 neurons with the sigmoid activation function. Next, 25% of the connected neurons are dropped by a dropout layer. Finally, the last dense layer provides a probability distribution over 2000 classes, with values ranging from 0 to 1, by means of the softmax activation function; each class labels one of the 2000 realizations of the aberration phase.
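The architecture described above can be written down as a Keras sketch. Several details are not given in the text and are assumptions here: the pooling size of 2, the default "valid" padding, the use of integer class labels with the sparse categorical cross-entropy, and RMSprop as the optimizer. The function name `build_model` is hypothetical.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_shape=(50, 50, 3), n_classes=2000):
    """Sequential CNN mapping a PMI to one of n_classes aberration labels."""
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=input_shape))
    for filters in (32, 64, 128):
        model.add(layers.Conv2D(filters, kernel_size=3, activation="tanh"))
        model.add(layers.BatchNormalization())
        model.add(layers.AveragePooling2D(pool_size=2))  # pool size assumed
    model.add(layers.Flatten())
    model.add(layers.Dense(3000, activation="sigmoid"))
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(
        optimizer="rmsprop",                      # assumed optimizer
        loss="sparse_categorical_crossentropy",   # assumes integer labels
        metrics=["accuracy"],
    )
    return model

model = build_model()
```

Since the network outputs a class index rather than the coefficients themselves, the predicted index would be used to look up the corresponding row in the stored table of Zernike coefficients.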

Figure 4. Stokes images obtained at the focal plane for the radially polarized Gaussian-like beam with a random phase described by Zernike polynomials.

Figure 5. Normalized Stokes parameters for the radially polarized Gaussian-like beam without aberrations (first row) and with a random phase described by Zernike polynomials (second row).

Figure 6 shows the longitudinal components corresponding to the Stokes parameters shown in figure 5.

Figure 6. Longitudinal components obtained by focusing the radially polarized Gaussian-like beam (a) without aberrations and (b) with aberrations described by Zernike polynomials.