Generating images of hydrated pollen grains using deep learning

Pollen grains dehydrate during their development and after their departure from the host anther. Since the size and shape of a pollen grain can depend on environmental conditions, the ability to predict both of these properties for hydrated pollen grains from their dehydrated state could be beneficial in the fields of climate science, agriculture and palynology. Here, we use deep learning to transform images of dehydrated Ranunculus pollen grains into images of hydrated Ranunculus pollen grains. To test the accuracy of the image generation neural network, we then use a second deep learning neural network, trained on experimental images of pollen grains from different genera, to identify the hydrated pollen grains in the generated images. This pilot work demonstrates the first steps towards a general deep learning-based rehydration model that could be useful in understanding and predicting pollen morphology.


Introduction
Understanding the distribution, size and shape of pollen grains can be a useful tool in palynology for areas such as climate change [1,2], insect migration [3] and crop health [4]. The morphology of pollen grains varies depending on the genus [5], as well as on the developmental stage [6] and hydration state [7], both of which can be influenced by the environment [8]. Since the morphology of dehydrated pollen grains can differ from that of their corresponding hydrated state, the ability to determine a pollen grain's hydrated size and shape from images of its dehydrated state could be useful in inferring developmental or environmental conditions. For example, understanding infolding during dehydration could assist in understanding the design of pollen apertures [9], as well as the critical stages of pollen development [8].
Owing to the vast array of genera, sizes and shapes of pollen grains, which also vary depending on hydration level, developing a generic, traditional biological physics-based model to essentially rehydrate a pollen grain (be it a 2D image [10] or a 3D rendered structure [11]) collected, for example, in a pollen trap would be extremely challenging. Deep learning has been shown to be useful for accelerating data-driven biological physics-based models, which would otherwise be very time consuming and labour intensive owing to their complexity [12,13]. Hence, as an initial step, we use deep learning to generate images of pollen grains in a simulated hydrated state from images of dehydrated pollen grains of a single genus, namely Ranunculus (see figure 1). Deep learning is also useful for categorizing large amounts of data, such as images [14,15] and, of particular relevance to this work, for identifying pollen grains in images [16,17]. We therefore use deep learning to identify the genus of pollen grain present in the generated images (images of pollen in a simulated hydrated state), using a model trained on experimental images of pollen grains from 10 different genera, to help verify the success of the image generation neural network.

Sample preparation
For generation of hydrated images of pollen grains, we used Ranunculus. In this work, when the Ranunculus pollen grains were collected, they were already in a hydrated state on the flowers, thus negating the need for a laboratory rehydration process. For testing the accuracy of the generated images, freshly picked flowers from a total of 10 genera (Arnica, Asphodelus, Bellis perennis, Fuchsia, Kalanchoe, Lilium, Penstemon, Ranunculus, Salvia and Taraxacum), obtained from a local convenience store or the University grounds, were used. The picked flowers were kept with their stems in water with the aim of maintaining a level of hydration for as long as possible. Pollen grains were deposited for imaging by brushing the anthers of the flower (with laboratory grade cotton buds) close to the surface of a glass microscope slide.

Image acquisition
All pollen grains were imaged using a Nikon Eclipse microscope with a 20× objective (Nikon, LE Plan, NA = 0.4, working distance = 3.5 mm) and a Thorlabs DCC1645C CMOS camera, which had a 1280 × 1024-pixel colour sensor. To ensure consistency across all microscope images, the focal plane was set for each pollen grain so that the top of the grain was in focus. Ranunculus pollen grains are spherical in their hydrated state and become non-spherical and smaller as they dehydrate. Figure 2 shows example images (resized and cropped to 128 × 128 pixels) of Ranunculus pollen grains in (a) their initial hydrated state and (b) their subsequent dehydrated state in an air-conditioned laboratory (maintained at 22 °C) 30 min later. This time window was chosen because no visible change in the size and shape of the pollen grains occurred beyond this point. Since the Ranunculus pollen grains underwent visibly significant morphological change within approximately 30 min, as soon as hydrated pollen grains were deposited onto the glass slide, a 24-bit 1280 × 1024-pixel RGB image was taken every 30 s for 30 min, to monitor the dehydration process and ensure that dehydration had occurred. Once one set of pollen grain images had been recorded, the glass slide was cleaned and new hydrated pollen grains were deposited onto it for subsequent imaging. The variation in background colour and intensity is due to the microscope lighting and the position of the pollen grain within the field of view, with the darker blue colour occurring towards the edges of the field of view.
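The time-lapse schedule described above (one frame every 30 s over the 30 min dehydration window) can be sketched as below; the capture call is a placeholder for a camera grab, not a real camera API, and the 60-frame count simply mirrors the frames referred to later as frame 1 (hydrated) through frame 60 (dehydrated).

```python
INTERVAL_S = 30   # one frame every 30 s
N_FRAMES = 60     # 30 min sequence, frames 1..60

def capture_frame(t_s):
    """Placeholder for a camera grab at elapsed time t_s (seconds)."""
    return {"t_s": t_s}

# Frame i (1-indexed) is taken at (i - 1) * 30 s
frames = [capture_frame(i * INTERVAL_S) for i in range(N_FRAMES)]
print(len(frames), frames[-1]["t_s"])  # 60 frames, last at 1770 s
```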
For each pollen genus used in the categorization experiment, an image was taken immediately after deposition, to minimize any dehydration that may occur once the grains are detached from the flower. Images of approximately 20 separate pollen grains were taken for each genus. Example images of pollen grains for the 10 different genera, which include hydrated Ranunculus, are shown in figure 3.

Neural networks
Two neural networks were used in this work. The first was for generating images of hydrated pollen grains from images of dehydrated pollen grains, and the second was for identifying the pollen grains in the generated images.

Image generation
For the generation of images of hydrated Ranunculus pollen grains, we employed a neural network with a generative adversarial network-based architecture [18], namely Pix2Pix [19]. Such an architecture has been used for a variety of image-to-image transformations in science including, of particular relevance to this work, pollen grain image generation from scattering patterns [20] and from low-resolution images [21]. Images of Ranunculus pollen grains taken at the end of the 30 min image acquisition sequence (frame 60) were used as the input (dehydrated state), and images of hydrated pollen grains taken at the start (frame 1) were used as the target. Both the input and target images were resized and cropped to 128 × 128 pixels to reduce the amount of 'empty space' in the images, allowing the network to achieve higher accuracy. In total, just 80 image pairs were used in training the neural network, and so, to increase the variability of the data, the data were augmented during training via reflection, rotation and translation in the X and Y directions, with the aim of improving the neural network's training accuracy [22]. The neural network was trained for 5000 epochs using a minibatch size of 4 on an Nvidia Titan Xp graphics processing unit (GPU), taking approximately 8 h.
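The paired augmentation described above can be sketched as follows. The key point is that each dehydrated (input) and hydrated (target) image pair must receive the same random reflection, rotation and X/Y translation, so that the geometric correspondence between the pair is preserved; the parameter ranges here are illustrative assumptions, not values from this work.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_pair(inp, tgt, max_shift=8):
    """Apply one random reflection/rotation/translation to both images."""
    if rng.random() < 0.5:                      # horizontal reflection
        inp, tgt = inp[:, ::-1], tgt[:, ::-1]
    k = rng.integers(0, 4)                      # rotation by k * 90 degrees
    inp, tgt = np.rot90(inp, k), np.rot90(tgt, k)
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    inp = np.roll(inp, (dy, dx), axis=(0, 1))   # translation in X and Y
    tgt = np.roll(tgt, (dy, dx), axis=(0, 1))   # same shift for the target
    return inp, tgt

# Example: augment one 128 x 128 grayscale image pair
inp = rng.random((128, 128))
tgt = rng.random((128, 128))
a_inp, a_tgt = augment_pair(inp, tgt)
print(a_inp.shape, a_tgt.shape)  # (128, 128) (128, 128)
```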

Pollen identification
A convolutional neural network (CNN) [23] was trained to recognize the 10 different genera of pollen grains listed in section 2.1, with training data consisting of approximately 20 images per genus. In the case of Ranunculus, additional micrographs of its hydrated state were recorded to avoid reusing images that had previously been used to train the image generation network. Such image reuse could have introduced bias into the pollen identification network, making it easier for it to characterize images from the image generation network as Ranunculus. Again, images were cropped and resized to 128 × 128 pixels prior to training. The neural network was trained for 500 epochs using an Nvidia GeForce RTX 2070 GPU, with a minibatch size of 32 and a learning rate of 0.0002, taking 40 min to train until the minibatch accuracy was 100%. Figure 4 shows images of five different pollen grains, in rows (a) to (e). As seen in the figure and corresponding labels, irrespective of the size and shape of the pollen grain in the input image, the neural network was able to generate images of hydrated pollen grains whose areas were within ∼4% of those calculated from the actual, experimentally obtained images of hydrated pollen grains. The areas were calculated by first binarizing each image to obtain the overall shape of the pollen grain (setting pixels in the image's blue channel with intensity above 55% to 0, and those below 55% to 1), then setting all pixels within the grain's identified perimeter to 1 and summing the pixels, where 1 pixel ≈ 0.1 μm². Positional offsets of the pollen grains in the generated images, seen in column 4 as crescents of background colour, are most likely due to movement of the pollen grains during dehydration.
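The area measurement described above can be sketched as a few lines of Python: threshold the blue channel at 55% intensity (above the threshold → 0, below → 1), fill the grain's interior, then sum the pixels and convert using the stated scale of ∼0.1 μm² per pixel. The synthetic dark disc below stands in for a real micrograph.

```python
import numpy as np
from scipy.ndimage import binary_fill_holes

PIXEL_AREA_UM2 = 0.1  # 1 pixel ~ 0.1 um^2, as stated in the text

def grain_area_um2(blue_channel, threshold=0.55):
    """Binarize (dark grain -> 1), fill the interior, and sum pixels."""
    mask = blue_channel < threshold      # intensity below 55% -> 1
    filled = binary_fill_holes(mask)     # set interior pixels to 1
    return filled.sum() * PIXEL_AREA_UM2

# Synthetic test image: bright background (0.8) with a dark disc (0.2)
yy, xx = np.mgrid[:128, :128]
disc = (yy - 64) ** 2 + (xx - 64) ** 2 <= 40 ** 2
img = np.where(disc, 0.2, 0.8)
print(round(grain_area_um2(img), 1))  # close to pi * 40^2 * 0.1 ~ 502.7
```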
Additional error in the generated images could result from the initial hydrated state varying between the pollen grains used in the training and testing data. Table 1 displays the calculated Structural Similarity Index Measure (SSIM), which evaluates the luminance, structure and contrast between the generated and the actual, experimentally collected images. Note that the closer the SSIM value is to the maximum value of 1, the greater the similarity. As shown in the table, the most accurately generated image was figure 4(e), with an SSIM of 0.647. The mean SSIM of all the generated images was 0.615 ± 0.086. The neural network was able to predict the correct hydrated pollen grain size increase, with a mean increase from the dehydrated state of 13.4% for the actual images and 13.9% for the generated images, as shown in table 1. Indeed, there is a clear correlation between the actual and generated percentage increases, with an R-squared value of 0.83, showing that the neural network takes the initial size and appearance into consideration when predicting the increase.
[Figure 4 caption fragment: the actual image column shows experimental microscope observations of the hydrated state; the generated image column shows the simulated hydrated state, as generated by the neural network from the image shown in the input column; the fourth column (error) shows the difference between the actual and generated images, with darker pixels indicating more accurate generation; inset purple text gives the area of each pollen grain.]
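The correlation check described above can be sketched as the squared Pearson correlation between the actual and generated percentage size increases; the five values below are placeholders for illustration, not the data behind the reported R-squared of 0.83.

```python
import numpy as np

# Placeholder percentage increases (dehydrated -> hydrated), not real data
actual    = np.array([10.0, 12.5, 13.0, 15.0, 16.5])
generated = np.array([10.8, 12.0, 13.9, 15.5, 17.0])

# R-squared as the squared Pearson correlation coefficient
r2 = np.corrcoef(actual, generated)[0, 1] ** 2
print(round(r2, 3))
```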

Results and discussion
As shown in the confusion matrix of figure 5(a), the identification neural network was 100% accurate in categorizing the experimental images of pollen grains. This gave us confidence to test the categorization network with the generated images of hydrated pollen grains. The neural network was also 100% accurate in identifying the generated images as Ranunculus pollen. As shown in figure 5(b), the categorization probabilities (averaged over all generated images) predicted Ranunculus with ∼95% confidence. The next nearest probability was Lilium, with ∼5%, perhaps due to a similarity in colour, as observed in figure 3.
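The two quantities reported above, overall accuracy from a confusion matrix and per-class confidence averaged over all generated images, can be sketched as below; the diagonal confusion matrix and the random softmax outputs are synthetic placeholders, not this work's data.

```python
import numpy as np

rng = np.random.default_rng(1)
GENERA = ["Arnica", "Asphodelus", "Bellis perennis", "Fuchsia", "Kalanchoe",
          "Lilium", "Penstemon", "Ranunculus", "Salvia", "Taraxacum"]

def accuracy_from_confusion(cm):
    """Fraction of correct predictions: trace over total count."""
    return np.trace(cm) / cm.sum()

# A perfectly diagonal 10-class confusion matrix (20 test images per genus)
cm = np.diag(np.full(len(GENERA), 20))
print(accuracy_from_confusion(cm))  # 1.0

# Mean per-class probability over a batch of softmax outputs
probs = rng.dirichlet(np.ones(len(GENERA)), size=80)  # placeholder outputs
mean_probs = probs.mean(axis=0)
print(GENERA[int(np.argmax(mean_probs))])
```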
More training data and higher-fidelity images would perhaps allow for greater accuracy in the image generation and image identification neural networks. Furthermore, since pollen grains vary in three dimensions, imaging methods such as X-ray imaging [11] and tomography [24] would provide neural networks with additional information for understanding the dehydration process, and would allow for image transformation of a variety of spherical and non-spherical pollen grains. Additionally, to enable the collection of larger datasets from a variety of pollen genera, including pollen grains collected in a dehydrated state, a laboratory rehydration procedure should be employed. A more general biological model could be created by expanding the dataset even further to other biological cell transformations, such as those of red blood cells [25].

Conclusion
We have shown the ability to use deep learning to transform images of dehydrated Ranunculus pollen grains into images of hydrated pollen grains. The mean error between the actual and generated hydrated images, determined via SSIM calculations, was 0.615 ± 0.086. We have also used deep learning to correctly identify Ranunculus pollen grains from their generated hydrated images, with ∼95% confidence. Future work could explore using the deep learning techniques described here for the rehydration of multiple genera, which would likely require several orders of magnitude more data and perhaps higher-resolution images. It could also examine the process of pollen dehydration in greater detail, providing a deeper understanding of the underlying biological physics. In addition, the ability to create images of pollen grains in different hydration states could allow the creation of large volumes of training data for a neural network used to sense different genera of pollen at different levels of hydration, without the need to experimentally obtain large volumes of microscope image data, which would be very time consuming.