Paper • Open access

SeedAI: a novel seed germination prediction system using dual stage deep learning framework


Published 29 December 2023 © 2023 The Author(s). Published by IOP Publishing Ltd
Citation: D Ramesh Reddy et al 2023 Environ. Res. Commun. 5 125014. DOI: 10.1088/2515-7620/ad16f1


Abstract

The detection of germinating seeds through automated means is a significant concern for seed testing agencies. Traditional approaches rely on manual inspection. In recent years, there has been increasing scientific focus on deep learning, particularly in the domain of seed detection, recognition, and germination in germination trays. In this paper, a novel two-stage network is proposed which leverages Convolutional Neural Networks (CNNs) to automate the detection of seeds and the assessment of their germination state. In the first stage, the Mask R-CNN framework is used for instance segmentation of seeds, and in the next stage the resulting Region of Interest (RoI) is given as input to the proposed CNN model for germination prediction. The proposed model is trained and tested on our own dataset. The experimental results show that the proposed model achieves better performance than state-of-the-art models with fewer trainable parameters.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Agriculture and its allied sectors are the largest livelihood providers in India, contributing significantly to the country's Gross Domestic Product (GDP). Demand for food rises as the population increases. Crop productivity improvements have been critical in feeding a rising population while also decreasing the environmental effect of food production, lowering the amount of land utilised for agriculture by raising crop yields [1]. Approximately 200 million tons of food grains are produced annually [2]. Sustainable agriculture requires seed as a crucial input [3]; the response of other inputs depends to a considerable extent on the quality of seeds. Figure 1 depicts the global crop yields in hectares and tons. The yield of crops relies not only on different environmental facets but mainly on the quality of seeds [4]. The process of seed germination is a fundamental aspect of a plant's life cycle [5]. The essential phase for plant production is the seeding stage [6]. Germination is the capability of a seed, under usual sowing circumstances, to give rise to a normal seedling when planted and evolve into a new plant [7]. The seed germination process begins with the uptake of moisture [8] through the seed coat and the emergence of the developing root tip of the embryo [9]. The process ends with the embryo's growth into a seedling [10].


Figure 1. Block diagram of the proposed two-stage model.


The contributions of this work are summarised as follows:

  • A growth chamber system is designed specifically for collecting a dataset of seeds during their germination stage.
  • A novel two-stage network is proposed to accurately identify seeds within petri dishes and classify them based on their germination status.
  • The proposed model is compared with other state-of-the-art models.

The rest of the paper is organised as follows: section 2 presents a comprehensive examination of machine learning models utilised in the context of seed germination. Section 3 provides an overview of the proposed methodology; the dataset, metrics, and outcomes of the experiments are elaborated upon in section 4. Section 5 concludes by presenting the findings derived from these experiments.

2. Literature review

The usage of machine vision technology has been extensively implemented across diverse domains of agricultural production [11]. Many studies have presented digital image processing for non-destructive rice seed classification, with the majority of rice seed pictures processed utilizing three key aspects: texture [12], colour [13], and morphological [14] features [15]. One approach utilizes the Hue Saturation Value (HSV) colour model for classifying the germination status of seeds based on the disparities in colour between the seeds and roots; this method has been effectively employed in the seed germination process [16].

An automated monitoring system was developed to oversee the complete sunflower seed growth process. The system employed colour, edge, and other relevant data, and was implemented under controlled lighting, temperature, and humidity conditions. This system can serve as a foundation for designing environmental parameters during the breeding process [17]. The aforementioned methods are predicated upon established image analysis techniques that predominantly evaluate the germination status through the chromatic attributes of the seed coat and sprout. As a result, their applicability is limited to particular seed types and requires precise experimental conditions. In addition, the utilization of these methodologies necessitates advanced image acquisition apparatus, thereby restricting their extensive application. Moreover, the market offers a limited number of automated solutions specifically developed for performing seed germination assays. The Germination Scanalyzer is a sophisticated automated apparatus that was created in Germany to detect seed germination [18].

The present system employs distinctive blue or grey filter paper as a substrate for the placement of seeds. The process involves the determination of the centre of mass through the measurement of seed weight, while the length of germination is ascertained by analyzing the colour characteristics of seeds, buds, and the background. Subsequently, this data is utilized to ascertain the state of germination. The Germination Scanalyzer's methodology has been demonstrated to meet practical requirements for consistency and precision. It is important to acknowledge, however, that this approach possesses particular constraints and distinct utilization prerequisites. The procedure necessitates the use of colour filter paper and entails precise criteria for seed and bud hues, thereby rendering it appropriate for examining only particular seed varieties.

The Wanshen seed automatic counting instrument, which was domestically developed, is another automated tool used for seed germination testing [19]. This device aids in the automation of the process of counting seeds. In contrast to the previously mentioned approach, this method does not necessitate a prerequisite for the colour of the seed. Rather than manual counting, this method utilises shape-fitting algorithms to achieve precise seed counting automatically. Nonetheless, in the case of this particular apparatus, it is imperative that the seeds be positioned on a designated plate that emits white light. Although the device is capable of automatically providing the total number of seeds, it lacks the ability to automatically assess the germination status of said seeds.

The agriculture domain [20] lacks explicit reference to the utilization of AI-based autonomous monitoring and predictive tasks [21]. Furthermore, in terms of autonomous operation, restricted energy storage and elevated power consumption have been a significant area of focus. As previously observed, the integration of computer vision [22, 23] and machine learning [24] has the potential to effectively tackle the issue of seed germination control in the context of industrial automation. The majority of the aforementioned methods do not incorporate an automated data collection model for acquiring various germination states, and all of them relied on pre-existing, pretrained models for classification. In this study, a novel approach is presented for extracting seeds from a petri dish, followed by their classification using a customised classification model. Our proposed method aims to enhance performance and accuracy in seed classification.

Germination is one of the main seed quality tests. Seed industries, seed breeders, and agricultural research institutes require seed testing for the production of high-quality seeds. Current research uses image processing technology for seed germination assessment, and such methods are mostly manual and laborious. The proposed novel system performs seed detection using Mask R-CNN and germination classification using the proposed CNN model, thereby leveraging architectures for object recognition and instance segmentation and tailoring them to the unique needs of finding and classifying seeds in images.

The proposed system collects a dataset comprising diverse seed samples and employs a deep learning model based on Mask R-CNN for segmentation, followed by classification of the seeds based on their germination status.

3. Proposed methodology

For effective automation of the seed germination testing process, a two-stage network based on deep learning is proposed. The proposed model consists of Mask R-CNN in the first stage for instance segmentation of seeds and a novel CNN architecture in the second stage for classification of seed germination. The proposed model is shown in figure 1.

3.1. Seed extraction using Mask R-CNN

The first stage of the proposed model includes the implementation of Mask R-CNN [25]; it is shown in figure 2 and its architecture is shown in table 1. Initially, the captured images of the petri dishes are pre-processed by resizing them to a predetermined size and standardising the pixel values. The data is subsequently transformed into a tensor in order to ensure compatibility with the neural network.
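
As a concrete illustration, this resize-normalise-tensor pipeline could be written with torchvision as in the following minimal sketch; the target size and normalisation statistics are illustrative assumptions, not values reported in this work.

```python
import torchvision.transforms as T

# Minimal sketch of the pre-processing pipeline (assumed 224 x 224 target size
# and ImageNet normalisation statistics; the paper does not specify these).
preprocess = T.Compose([
    T.Resize((224, 224)),                    # resize to a predetermined size
    T.ToTensor(),                            # HWC uint8 -> CHW float in [0, 1]
    T.Normalize(mean=[0.485, 0.456, 0.406],  # standardise pixel values
                std=[0.229, 0.224, 0.225]),
])

# tensor = preprocess(petri_dish_image)  # petri_dish_image: a PIL.Image
```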


Figure 2. First stage architecture of the proposed system.


The pre-processed image undergoes analysis using a CNN, which serves as the backbone network and has been specifically trained for the purpose of object detection (seed). The neural network employed in this study is designed to extract and analyse high-level features from the input image, thereby effectively capturing and encoding pertinent information pertaining to the visual characteristics of the seeds.

The Region Proposal Network (RPN) [26, 27] examines the extracted features in order to produce a collection of bounding box proposals that may potentially encompass seeds. The aforementioned proposals are acquired through the process of sliding anchor windows across the feature map and subsequently making predictions regarding the presence of a seed within each individual anchor.
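
To make the anchor mechanism concrete, the sketch below lays a dense grid of anchor boxes over a feature map; the scales, aspect ratios, and function name are illustrative assumptions rather than the configuration used in this work.

```python
import torch

def generate_anchors(fm_h, fm_w, stride=16, scales=(32, 64, 128),
                     ratios=(0.5, 1.0, 2.0)):
    """Place anchor boxes (x1, y1, x2, y2) at every feature-map cell."""
    anchors = []
    for y in range(fm_h):
        for x in range(fm_w):
            # anchor centre in input-image coordinates
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    w, h = s * r ** 0.5, s / r ** 0.5  # area s^2, aspect ratio r
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return torch.tensor(anchors)

# A 14 x 14 feature map with 9 anchors per cell yields 14 * 14 * 9 = 1764 boxes,
# each of which the RPN scores for the likelihood of containing a seed.
```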

The remaining proposals undergo a process of refinement through the elimination of redundant and low-confidence ones. Region of Interest (RoI) pooling is employed to extract feature maps of a consistent size for each refined proposal, with the objective of emphasizing the pertinent regions.

The feature maps of the RoI are subjected to processing via Fully Connected (FC) layers in order to extract features and perform classification. This step involves the assignment of class probabilities to each proposed bounding box, which serves to indicate the likelihood of a seed being present. Additionally, it refines the coordinates of the bounding box in order to align more accurately with the boundaries of the seed.

During the training process, the network parameters are optimised by comparing the predicted bounding box coordinates and class probabilities with the ground truth annotations. This comparison is used to compute various loss functions, such as region proposal loss, classification loss, and bounding box regression loss.

During the process of inference, the model that has been trained employs the predictions of bounding boxes and class probabilities in order to ascertain the existence and precise positioning of seeds within the petri dish. The aforementioned predictions undergo a filtering process that involves the utilisation of confidence scores and Non-Maximum Suppression (NMS) in order to eliminate redundant or overlapping detections. The ultimate result furnishes the precise coordinates of the bounding boxes and the corresponding class labels for the identified seeds.
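
A minimal sketch of this filtering step using torchvision's built-in NMS is shown below; the confidence and IoU thresholds are assumed values.

```python
import torch
from torchvision.ops import nms

def filter_detections(boxes, scores, score_thresh=0.5, iou_thresh=0.5):
    """boxes: (N, 4) predicted seed boxes; scores: (N,) confidence scores."""
    keep = scores > score_thresh              # drop low-confidence detections
    boxes, scores = boxes[keep], scores[keep]
    kept = nms(boxes, scores, iou_thresh)     # suppress overlapping duplicates
    return boxes[kept], scores[kept]
```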

3.2. Seed germination classification using CNN

The second stage encompasses the proposed CNN model. The purpose of this model is to classify seeds based on their germination characteristics. It consists of a Conv2D layer comprising six filters of dimensions 3 × 3, which yields an output shape of (None, 28, 28, 6) and possesses a total of 896 parameters that can be adjusted during the training process. Subsequently, a MaxPooling2D layer is implemented, utilising a pool size of 2 × 2, thereby yielding an output shape of (None, 14, 14, 6). Next, a Conv2D layer with 16 filters, each having a size of 3 × 3, is appended. This layer produces an output shape of (None, 10, 10, 16) with 2,416 total trainable parameters. A subsequent MaxPooling2D layer is implemented with a pooling size of 2 × 2, which yields an output shape of (None, 5, 5, 16). Subsequently, the output tensor undergoes flattening through the utilisation of the flatten layer, leading to a resultant shape of (None, 120). Finally, a dense layer comprising 128 units and employing the Rectified Linear Unit (ReLU) activation function is incorporated, yielding an output shape of (None, 84) with 320,002 trainable parameters.

Table 1. Mask R-CNN Architecture.

Layer | Type | Kernel size | Stride | Output size
1 | Conv2D | 3 × 3 | 1 | 224 × 224 × 64
2 | BatchNorm | – | – | 224 × 224 × 64
3 | ReLU | – | – | 224 × 224 × 64
4 | MaxPooling2D | 2 × 2 | 2 | 112 × 112 × 64
5 | Conv2D | 3 × 3 | 1 | 112 × 112 × 128
6 | BatchNorm | – | – | 112 × 112 × 128
7 | ReLU | – | – | 112 × 112 × 128
8 | MaxPooling2D | 2 × 2 | 2 | 56 × 56 × 128
9 | Conv2D | 3 × 3 | 1 | 56 × 56 × 256
10 | BatchNorm | – | – | 56 × 56 × 256
11 | ReLU | – | – | 56 × 56 × 256
12 | Conv2D | 3 × 3 | 1 | 56 × 56 × 256
13 | BatchNorm | – | – | 56 × 56 × 256
14 | ReLU | – | – | 56 × 56 × 256
15 | MaxPooling2D | 2 × 2 | 2 | 28 × 28 × 256
16 | Conv2D | 3 × 3 | 1 | 28 × 28 × 512
17 | BatchNorm | – | – | 28 × 28 × 512
18 | ReLU | – | – | 28 × 28 × 512
19 | Conv2D | 3 × 3 | 1 | 28 × 28 × 512
20 | BatchNorm | – | – | 28 × 28 × 512
21 | ReLU | – | – | 28 × 28 × 512
22 | MaxPooling2D | 2 × 2 | 2 | 14 × 14 × 512
23 | Conv2D | 3 × 3 | 1 | 14 × 14 × 512
24 | BatchNorm | – | – | 14 × 14 × 512
25 | ReLU | – | – | 14 × 14 × 512
26 | Conv2D | 3 × 3 | 1 | 14 × 14 × 512
27 | BatchNorm | – | – | 14 × 14 × 512
28 | ReLU | – | – | 14 × 14 × 512
29 | MaxPooling2D | 2 × 2 | 2 | 7 × 7 × 512
30 | Flatten | – | – | 1 × 25088
31 | Dense | – | – | 1 × 4096
32 | Dropout | – | – | 1 × 4096
33 | Dense | – | – | 1 × 4096
34 | Dropout | – | – | 1 × 4096
35 | Dense | – | – | 1 × 1000
36 | Softmax Activation | – | – | 1 × 1000

Lastly, the model includes a Dense layer comprising two units and utilises the softmax activation function to represent the output classes. The layer's output shape is defined as (None, 2), indicating that it produces a tensor with an unspecified number of rows and 2 columns. The proposed CNN model consists of a total of 339,394 parameters, as shown in table 2. This CNN model is used for classifying seeds as either germinated or non-germinated.

Table 2. Architecture of proposed classification model.

Layer | Output shape | # Parameters
conv2d_6 (Conv2D) | multiple | 896
max_pooling2d_6 (MaxPooling2D) | multiple | 0
conv2d_7 (Conv2D) | multiple | 18,496
max_pooling2d_7 (MaxPooling2D) | multiple | 0
flatten_3 (Flatten) | multiple | 0
dense (Dense) | multiple | 320,002
Total | | 339,394
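
For illustration, the layer sequence described above can be sketched in PyTorch as follows; the input resolution is an assumption (the reported shapes imply a small RoI crop), and nn.LazyLinear is used so that the flattened size is inferred automatically.

```python
import torch
import torch.nn as nn

# Sketch of the second-stage classifier: filter counts and unit sizes follow
# the text; the 32 x 32 RGB input size is an assumption for illustration.
model = nn.Sequential(
    nn.Conv2d(3, 6, kernel_size=3),   # 6 filters of 3 x 3
    nn.ReLU(),
    nn.MaxPool2d(2),                  # 2 x 2 pooling
    nn.Conv2d(6, 16, kernel_size=3),  # 16 filters of 3 x 3
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.LazyLinear(128),               # dense layer with 128 units
    nn.ReLU(),
    nn.Linear(128, 2),                # two output units
    nn.Softmax(dim=1),                # germinated vs non-germinated
)

probs = model(torch.randn(1, 3, 32, 32))  # (1, 2) class probabilities
```

In practice the softmax would typically be folded into the cross-entropy loss during training; it is kept explicit here to mirror the description above.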

Once the seeds in the petri dish have been detected using object detection with Detectron2, the subsequent procedure involves the classification of each seed as either germinated or non-germinated. This objective is achieved using the proposed CNN model.

The identified seed Regions of Interest (RoIs), which encompass the individual seed images, are extracted from the image of the petri dish. The seed images undergo pre-processing by being resized to a predetermined size and having their pixel values normalised; this guarantees that the data is standardised to a uniform structure appropriate for the CNN model. The pre-processed seed images are subsequently input to the CNN model. The CNN is composed of several convolutional and pooling layers, which acquire and identify significant features from the initial images. The extracted features then undergo processing in FC layers, which perform the classification task by converting the features into class probabilities indicating the likelihood of germination for each individual seed. During the training process, the CNN model is trained using labelled data consisting of seed images, each associated with a germination label indicating whether the seed has germinated or not.

The parameters of the model are optimised by employing loss functions, such as cross-entropy, with the objective of minimising the discrepancy between the predicted labels and the true labels. During inference, the trained CNN model takes the pre-processed seed images as input and predicts the germination status of each individual seed. The classification label (germinated or not germinated) is determined by applying a predefined threshold to the class probabilities obtained from the output layer.
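
A hedged sketch of this inference step is given below; the 0.5 threshold and the assignment of column 1 to the 'germinated' class are assumptions.

```python
import torch

@torch.no_grad()
def classify_seeds(model, seed_batch, threshold=0.5):
    """seed_batch: (N, C, H, W) tensor of pre-processed seed crops."""
    model.eval()
    probs = model(seed_batch)               # (N, 2) class probabilities
    germinated = probs[:, 1] > threshold    # column 1 assumed 'germinated'
    return ["germinated" if g else "not germinated" for g in germinated]
```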

By employing a proposed CNN architecture and utilising a meticulously annotated dataset comprising images of both germinated and non-germinated seeds, it becomes feasible to effectively categorize the identified seeds within the petri dish, thereby furnishing significant insights pertaining to their germination status.

4. Experimental results

The proposed system is a two-stage model utilised for the prediction of seed germination status.

4.1. Dataset collection

An automated growth chamber is set up to collect the seed germination dataset, as shown in figure 3. The dataset presented in this article pertains to the germination of seeds in a growth chamber [28], during which they underwent individual germination over a period of forty-eight hours. The experimental setup comprises a germination tray with 4 petri dishes, each containing more than ten seeds, as shown in figure 4(a). An automated system is employed to regulate and sustain consistent levels of temperature, humidity, moisture, and light, which helps the seeds to germinate. Figure 4(b) displays a sample image from our dataset obtained through the automated embedded system. The seeds are captured using a Raspberry Pi [29] and camera, and the images are subsequently stored in a database at regular intervals of 5 minutes. This procedure is iterated over a span of 3-4 days, depending upon the duration required for the seeds to germinate. The collected seed data is annotated to provide ground truth images for training the deep learning model. The collected dataset distribution is given in table 3.
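
A minimal sketch of such a periodic capture loop is shown below; the camera index, file naming, and local file storage (rather than the paper's database) are assumptions for illustration.

```python
import time
import cv2

camera = cv2.VideoCapture(0)   # camera attached to the Raspberry Pi (assumed index)
INTERVAL_S = 5 * 60            # one frame every 5 minutes

for _ in range(4 * 24 * 12):   # roughly 4 days of captures at 12 frames per hour
    ok, frame = camera.read()
    if ok:
        # the paper stores images in a database; files are used here for brevity
        cv2.imwrite(f"seeds_{int(time.time())}.jpg", frame)
    time.sleep(INTERVAL_S)

camera.release()
```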


Figure 3. (a) Automated system for seeds data collection. (b) Raspberry Pi [28] board collecting images and displaying temperature and Humidity. (c) Germinated seeds tray with temperature and humidity sensor. (d) Maintaining constant temperature and Humidity and display on touch screen.


Figure 4. (a) Model to collect images of seeds (b) Collected image by proposed model.


Figure 5. Example of generated seed dataset (a) seeds image (b) mask image tile of seed labels.


Table 3. Dataset distribution (number of seed images).

S.No | Stage | Training | Testing | Total
1 | Stage 1 | 572 | 147 | 719
2 | Stage 2 | 858 (Germination), 858 (No Germination) | 240 | 1955

The dataset encompasses the germination process of red gram seeds, captured through the utilization of an automated camera system. This system was programmed to record photos at regular intervals of 5 minutes, thus providing a granular view of the germination process. This repeated imaging captures a thorough path from initial swelling to radicle appearance and subsequent expansion. Quick intervals enable the early identification of germination and any anomalies, allowing for a fast response or analysis. Furthermore, the resulting rich dataset is invaluable for temporal research and improves the efficacy of deep learning models by providing a large number of data points. Overall, such a thorough monitoring system provides critical information for both immediate observations and sophisticated analytical procedures.

4.2. Data annotation

During the annotation process, a seed label is assigned to an image frame for the purpose of identifying the seed in a petri dish. This is done using the Roboflow tool [30] and is illustrated in figure 6.


Figure 6. Annotating the seed dataset using the Roboflow tool. (a) Petri dish (b) Petri dish with seeds (c) Seeds without petri dish (d) Individual seed annotation.


4.3. Experimental setup

The proposed model is evaluated using the PyTorch framework on a Tesla V100-SXM2 GPU with CUDA version 12.0. The Mask R-CNN is implemented using the Detectron2 framework with a suitable backbone network architecture and trained on the collected dataset. The mask images with seed labels are generated from the seed dataset as shown in figure 5.
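
A typical Detectron2 setup for this kind of single-class Mask R-CNN experiment is sketched below; the dataset name, backbone choice, and solver settings are assumptions rather than the paper's exact configuration.

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("seed_train",)   # registered seed dataset (assumed name)
cfg.DATASETS.TEST = ()
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1    # a single 'seed' class
cfg.SOLVER.MAX_ITER = 3000             # 3000 training steps (cf. section 4.5)

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```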

4.4. Evaluation metrics

In order to assess the effectiveness of our proposed two-stage network in predicting seed germination, five widely used metrics, namely pixel accuracy, Intersection over Union (IoU), precision, recall, and F1 score, are employed. The pixel accuracy refers to the proportion of pixels in the image that have been accurately segmented. The term 'IoU' denotes the overlap between predicted masks and ground truth masks. Precision is the ratio of correctly extracted masks to all predicted masks, whereas recall is the ratio of correctly recognised masks to the ground truth. Finally, the F1 score is defined as the harmonic mean of precision and recall. These metrics are summarised by the equations below:

Pixel accuracy = (TP + TN)/(TP + TN + FP + FN)

IoU = TP/(TP + FP + FN)

Precision = TP/(TP + FP)

Recall = TP/(TP + FN)

F1 = 2 × (Precision × Recall)/(Precision + Recall)

Here, True Positive (TP) and True Negative (TN) denote cases in which the model correctly predicts the positive and negative class respectively, while False Positive (FP) denotes a negative sample predicted as positive and False Negative (FN) a positive sample predicted as negative.
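
These definitions translate directly into code; the sketch below computes all five metrics from the four counts.

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Compute the five evaluation metrics from TP, FP, TN, FN counts."""
    pixel_accuracy = (tp + tn) / (tp + tn + fp + fn)
    iou = tp / (tp + fp + fn)          # intersection over union
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return pixel_accuracy, iou, precision, recall, f1
```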

4.5. Results

The Mask R-CNN model in the first stage of the proposed model is trained for 3000 epochs and the CNN in the second stage is trained for 50 epochs. The collected dataset is partitioned in an 80:20 ratio for training and testing. Stage-wise training performance of the proposed model is shown in figure 7 and figure 8. The input and outcome of stage 1 for different input images is shown in figure 9. Next, a bitwise 'or' operation is performed between the stage 1 output and a binary mask to extract the region of interest, as shown in figure 10. This output is fed as input to stage 2 to classify the seed germination status. The proposed model is then tested on the test dataset and compared with state-of-the-art models in terms of accuracy, IoU, precision, recall, and F1 score. These results are furnished in table 4, with the best values highlighted.
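
One plausible reading of the bitwise 'or' step is sketched below with OpenCV: OR-ing the image with the inverted binary mask preserves the seed pixels and saturates the background to white; the file names are illustrative.

```python
import cv2

image = cv2.imread("petri_dish.png")                       # stage-1 input image
mask = cv2.imread("seed_mask.png", cv2.IMREAD_GRAYSCALE)   # 255 inside each seed

inv_mask = cv2.bitwise_not(mask)                           # 255 outside the seeds
inv_mask_bgr = cv2.cvtColor(inv_mask, cv2.COLOR_GRAY2BGR)
roi = cv2.bitwise_or(image, inv_mask_bgr)                  # seeds kept, background white

cv2.imwrite("seed_roi.png", roi)                           # input to stage 2
```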


Figure 7. Stage-1 learning curve.


Figure 8. Stage-2 learning curve.


Figure 9. Mask detection results (instance segmentation illustration).


Figure 10. Regions of interest extracted by applying a bitwise 'or' between the stage-1 output and the binary mask.


Table 4. Evaluation metrics of the model with different pre-trained models.

Model | Pixel accuracy | IoU | Precision | Recall | F1
Proposed Model | 0.88 | 0.77 | 0.89 | 0.85 | 0.87
ResNet50 | 0.84 | 0.75 | 0.79 | 0.94 | 0.86
Inception | 0.85 | 0.45 | 0.49 | 0.85 | 0.88
LeNet | 0.82 | 0.70 | 0.79 | 0.94 | 0.82

4.5.1. ResNet50

A member of the 'Residual Networks' family, ResNet50 is a deep neural network architecture known for its 50 layers and novel skip connections, which are aimed at optimizing deep model training. The structure begins with a 7 × 7 convolutional layer, followed by a max-pooling layer. ResNet50 is made up of 16 residual blocks, each with a three-layer 'bottleneck' architecture that comprises a sequence of 1 × 1 and 3 × 3 convolutions. These blocks are linked together via skip or shortcut connections that allow certain layers to be bypassed, resulting in fast and reliable training for such a deep model. For classification problems, the design culminates in an average pooling layer and a fully connected layer with softmax activation. ResNet50's working principles centre on 'residual learning,' in which the network learns the difference (residual) between its input and the desired output. This method, when combined with skip connections, successfully mitigates the vanishing gradient problem that is typical in deep networks. Batch normalization, applied after practically every convolutional operation, standardises layer activations, facilitating stable and fast training. The model ensures non-linear transformations by making significant use of the ReLU activation function. The original design of ResNet50 has reinforced its place as a cutting-edge model, inspiring several deep learning studies and subsequent designs.
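
The residual idea can be illustrated with a simplified two-layer block; note that ResNet50 itself uses the three-layer 1 × 1 / 3 × 3 / 1 × 1 bottleneck described above.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Simplified residual block: output = ReLU(F(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)   # skip connection carries x around F(x)
```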

4.5.2. Google Inception

The Google Inception model, also known as GoogLeNet, revolutionized Convolutional Neural Networks (CNNs) with its 'Inception module.' This core module employs different filter sizes (1 × 1, 3 × 3, 5 × 5) simultaneously, successfully capturing features at varying spatial scales. Dimensionality is lowered to limit computing expense by 1 × 1 convolutions known as 'bottleneck layers.' Alongside these convolutions, each Inception module includes a parallel max-pooling layer, the outputs of which are concatenated in depth. Instead of the typical fully connected layers, GoogLeNet uses global average pooling to reduce overfitting, and it includes two auxiliary classifiers in intermediate layers to help with gradient propagation. Through its different filter sizes, the Inception module excels at recognising both broad patterns and fine details in pictures. The embedded 1 × 1 convolutions, sometimes known as 'network-in-network' layers, not only provide non-linear transformations but also significantly reduce dimensions, speeding subsequent calculations. Auxiliary classifiers contained within the framework enable regularization, reducing the danger of overfitting. Furthermore, GoogLeNet prefers stride-2 convolutions over aggressive pooling to minimize grid sizes while preserving picture information. GoogLeNet has firmly entrenched its position in deep learning history with its revolutionary structure and principles, providing the groundwork for future designs.
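
A simplified Inception module is sketched below; the branch channel counts are illustrative assumptions.

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Parallel 1x1, 3x3, 5x5 and pooling branches, concatenated by depth."""
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 16, 1)                        # 1x1 branch
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, 8, 1),          # 1x1 bottleneck
                                nn.Conv2d(8, 16, 3, padding=1))  # then 3x3
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, 8, 1),
                                nn.Conv2d(8, 16, 5, padding=2))  # then 5x5
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, 16, 1))         # pooling branch

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)
```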

4.5.3. LeNet

The architecture consists of seven layers, beginning with a 32 × 32 input layer. The seed image data is then processed and downsampled using alternating convolutional layers (C1, C3, C5) and pooling layers (S2, S4). The network is completed with a fully connected layer (F6) and an output layer designed for 10-class classification, as with the MNIST dataset. Convolutional layers build feature maps using kernels, whereas pooling layers use average pooling to minimise spatial dimensions. LeNet employs a series of convolutions, subsampling, and fully connected operations to obtain the final classification results. Non-linearity is introduced within the network layers using activation functions, typically tanh or sigmoid. The LeNet model is trained with backpropagation, supplemented with gradient descent optimization. Although modern models have become more complicated, LeNet's pioneering architecture cemented its historical relevance, paving the way for the widespread use of convolutional neural networks in a variety of applications.
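
The classic LeNet-5 layer sequence described above can be written compactly as follows, with tanh activations and average pooling per the original design.

```python
import torch.nn as nn

lenet = nn.Sequential(
    nn.Conv2d(1, 6, 5), nn.Tanh(), nn.AvgPool2d(2),   # C1 + S2
    nn.Conv2d(6, 16, 5), nn.Tanh(), nn.AvgPool2d(2),  # C3 + S4
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Tanh(),            # C5 (as a dense layer)
    nn.Linear(120, 84), nn.Tanh(),                    # F6
    nn.Linear(84, 10),                                # 10-class output
)
```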

The proposed model demonstrates a pixel accuracy of 0.88, denoting that 88% of the pixels in the predicted masks correspond with the ground truth masks. In comparison, ResNet50 demonstrates a pixel accuracy of 0.84, LeNet exhibits a pixel accuracy of 0.82, and Google Inception's pixel accuracy is 0.85. The Intersection over Union (IoU) metric reveals that the proposed model exhibits an IoU value of 0.77, signifying a substantial degree of overlap between the predicted masks and the ground truth masks; ResNet50 demonstrates an IoU value of 0.75, whereas LeNet exhibits an IoU value of 0.70 and the Inception model has an IoU value of 0.45. The precision score of the proposed model is 0.89, signifying that 89% of the predicted positive instances are accurately classified; ResNet50 demonstrates a precision score of 0.79, as does LeNet, while Google Inception's is 0.49. The recall score of the proposed system is 0.85, indicating that it correctly identifies 85% of the actual positive instances; ResNet50 and LeNet both demonstrate a recall score of 0.94, and Google Inception's is 0.85. The proposed system exhibits an F1 score of 0.87, which represents the harmonic mean of its precision and recall metrics; ResNet50 demonstrates an F1 score of 0.86, whereas LeNet exhibits an F1 score of 0.82 and Google Inception's is 0.82.

Based on the assessment criteria employed, it can be concluded that the proposed model exhibits greater efficiency compared to the other three models. It consistently attains superior performance in terms of pixel accuracy, Intersection over Union (IoU), precision, recall, and F1 score.

Further, the proposed model is compared in terms of model complexity with state-of-the-art models. These results are furnished in table 5 and the best values are highlighted. From the results, it is observed that the proposed model achieves on-par performance (88%) with Google Inception while reducing model complexity drastically, by 98.5%, which is a significant improvement.

Table 5. Total parameters of the proposed model and pre-trained models.

S.No | Model | Total parameters
1 | Proposed Model | 339,394
2 | LeNet | 61,326
3 | ResNet50 | 24,637,826
4 | Google Inception | 23,903,010

4.6. ROC curve

The Receiver Operating Characteristic (ROC) curve is commonly employed in the assessment of the efficacy of binary classification models. It depicts the relationship between the True Positive Rate (TPR) and the False Positive Rate (FPR) across different classification thresholds. To compare the proposed model with ResNet50, LeNet, and InceptionNet, ROC curves are plotted in figure 11. From the figure, it is observed that our proposed model achieves a better balance between TPR and FPR than the other state-of-the-art models.
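
A sketch of how such a comparison plot can be produced with scikit-learn and matplotlib is given below; the label array and per-model probability arrays are assumed inputs.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

def plot_roc(y_true, model_scores):
    """y_true: ground-truth germination labels (0/1).
    model_scores: mapping of model name -> predicted probabilities."""
    for name, scores in model_scores.items():
        fpr, tpr, _ = roc_curve(y_true, scores)
        plt.plot(fpr, tpr, label=f"{name} (AUC = {auc(fpr, tpr):.2f})")
    plt.plot([0, 1], [0, 1], linestyle="--")   # chance diagonal
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.legend()
    plt.show()
```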


Figure 11. ROC curve.


5. Conclusion

In the agricultural domain, seed germination testing is one of the main steps in assessing seed quality. It is used by seed industries, seed breeders, and agricultural research institutes for the production of high-quality seeds. However, current approaches to seed germination assessment based on image processing techniques are mostly manual and labour intensive. Hence, an automatic monitoring system is essential to speed up the testing process. In this paper, a novel approach is proposed for automatic seed germination classification. The proposed system uses Mask R-CNN in the first stage for seed segmentation and a CNN in the second stage for classification.

The proposed novel two-stage network leverages CNNs to automate the detection of seeds and the assessment of their germination state. In the first stage, the Mask R-CNN framework is used for instance segmentation of seeds, and in the next stage this Region of Interest (RoI) is given as input to the proposed CNN model for germination prediction.

The proposed model is trained and tested on our own collected dataset. The proposed model achieved the best performance in terms of accuracy, IoU, and precision when compared with the ResNet50, Inception Net, and LeNet models. Moreover, the proposed model achieved on-par performance (88%) with Google Inception while reducing model complexity drastically, by 98.5%, which is a significant improvement.

Data availability statement

The data cannot be made publicly available upon publication because they are owned by a third party and the terms of use prevent public distribution. The data that support the findings of this study are available upon reasonable request from the authors.

Declaration of conflict of interest

The authors state that they have no known conflicting financial or personal interests that may have seemed to affect the work presented in this article.

Ethics statement

Not applicable
