A Review of Artificial Intelligence Methods for Seed Quality Inspection based on Spectral Imaging and Analysis

The quality inspection of seeds is of great importance to agricultural production. Traditional detection methods, such as chemical treatment, hollow determination, hyperosmotic germination and electrophoretic analysis, are time-consuming and inefficient. In contrast, spectral imaging combined with image processing is non-destructive and is promising for rapid quality inspection of seeds. This review analyzes the characteristics of spectral analysis and spectral imaging technology. The advantages of applying near-infrared spectroscopy and spectral imaging to seed quality inspection and their current application status are introduced, and the research progress of image processing algorithms is described.


Introduction
As the global population grows, the demand for agricultural products is increasing rapidly, and meeting it is closely related to the quality of seeds. It is therefore of great significance to strengthen seed quality testing during the planting, production, transportation and sale of seeds. Crop seed quality inspection covers seed purity, moisture and lipid content, seed vigor and other indexes. Traditional manual inspection methods compare seeds by size, degree of desiccation, color difference after chemical treatment, hollow ratio and other information. Such methods are time-consuming and inefficient and cannot meet the needs of modern agricultural development. Applying modern science and technology to seed quality control is thus an important problem for agricultural development and for increasing grain production. In recent years, the equipment and technology for crop seed quality inspection have developed rapidly. Quality inspection methods that combine spectral image processing with artificial intelligence have emerged and make the quality inspection of agricultural products more automatic and intelligent. Among these methods, infrared spectroscopy and hyperspectral imaging have attracted wide attention from researchers worldwide because they are fast and can work in real time.
Spectral analysis technology could obtain the spectral characteristic information of the tested object [1], which has the advantages of easy operation, fast and nondestructive inspection, as well as

Spectral analysis technique
The intensity of the radiation that a substance absorbs or emits is proportional to the content of the substance. Spectral analysis is an analytical method built on this optical property, and it supports both qualitative and quantitative analysis of a substance [10]. According to the band used, the spectra in spectral analysis are divided into the visible/near-infrared spectrum and the mid-infrared spectrum. The bands of visible light, near-infrared light and mid-infrared light are 400~780 nm, 780~2500 nm and 2500~25000 nm, respectively. When a material is irradiated by infrared light, the groups in its molecules (such as C-H, N-H and O-H groups) selectively absorb photons, vibrate, and jump from the ground state to a higher energy level [11]. Transitions observed in the near-infrared spectrum are overtones and combinations of these vibrations, while transitions in the mid-infrared spectrum are fundamental molecular vibrational-rotational transitions. Because water and the other organic compounds in a seed absorb differently, the selective absorption spectrum of the seed can be used to analyze which substances it contains and in what amounts. Infrared spectroscopy has long been used in the analysis of agricultural products. As early as the 1960s, K.H. Norris [12] first applied near-infrared spectroscopy to analyze the moisture content of grains and seeds. With better analytical instruments, improved statistical methods and the development of computer software, the analysis of grains and seeds by infrared spectroscopy is becoming mature [13].
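The band limits quoted above can be written down as a small lookup, which also makes it easy to convert between wavelength (nm) and the wavenumber (cm^-1) commonly used in infrared spectroscopy. The following is a minimal sketch; the function names and the treatment of the boundary values 780 nm and 2500 nm are assumptions made for illustration.

```python
def wavelength_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nanometres to a wavenumber in cm^-1."""
    # 1 nm = 1e-7 cm, so the wavenumber is 1 / (wavelength_nm * 1e-7).
    return 1e7 / wavelength_nm


def spectral_band(wavelength_nm):
    """Name the band of a wavelength using the ranges given in the text.

    Assumption: each range includes its lower limit and excludes its upper
    limit, except the last range, which includes both limits.
    """
    if 400 <= wavelength_nm < 780:
        return "visible"
    if 780 <= wavelength_nm < 2500:
        return "near-infrared"
    if 2500 <= wavelength_nm <= 25000:
        return "mid-infrared"
    return "outside the listed ranges"


# Print band and wavenumber for three example wavelengths.
for nm in (650, 1450, 5555):
    print(nm, spectral_band(nm), round(wavelength_to_wavenumber(nm), 1))
```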
The working principle of the near-infrared (NIR) spectrometer is shown in Figure 1 [14]. Light emitted by the source passes through the monochromator, which selects infrared light of a given frequency. This light is focused on the seed sample, and the light transmitted or reflected by the seed is recorded by the transmittance or reflectance detector to obtain a spectrum that carries the sample information. Since spectral analysis is an indirect analysis technique, a correlation model between the spectral information and the attributes to be tested has to be established when seed spectra are analyzed [15], for example with Partial Least Squares Regression (PLSR) or a Support Vector Machine (SVM).
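To make the idea of a correlation model concrete, the following is a minimal sketch of a PLSR calibration with a single latent component (the PLS1 case, one response variable), written in plain Python. It is an illustration under simplifying assumptions, not the procedure of any cited study: practical calibrations use several components, spectral pre-processing and cross-validation. The function names `fit_pls1_one_component` and `predict_pls1` and the toy data are hypothetical.

```python
from math import sqrt


def fit_pls1_one_component(X, y):
    """Fit a one-component PLS1 model.

    X is a list of spectra (rows), y the measured attribute per spectrum.
    Assumes y is not constant and is correlated with at least one band,
    otherwise the weight vector below has zero length.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in band space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each centred spectrum onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the centred response on the scores.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_mean, y_mean, w, q


def predict_pls1(model, x):
    """Predict the attribute for one new spectrum x."""
    x_mean, y_mean, w, q = model
    score = sum((x[j] - x_mean[j]) * w[j] for j in range(len(x)))
    return y_mean + q * score


# Toy calibration set: three-band "spectra" that scale with the attribute.
X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, 12.0]]
y = [10.0, 20.0, 30.0, 40.0]
model = fit_pls1_one_component(X, y)
print(predict_pls1(model, [5.0, 10.0, 15.0]))
```

Further components would be obtained by deflating X and y with the first scores and repeating the same steps.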

Spectral imaging technique
The 3D data cube obtained by spectral imaging is shown in Figure 2 [16]. Spectral imaging uses an imaging spectrometer to obtain a three-dimensional data cube (x, y, λ), where (x, y) represents two-dimensional image information, including the shape, size, texture and other external features of the target object, and λ represents one-dimensional spectral information that reflects the internal characteristics of the target object, such as chemical composition and structure. Each pixel in the image therefore records a vector of values at different wavelengths.

Figure 2. Illustration of hyperspectral imaging near infrared spectroscopy
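The (x, y, λ) cube described above can be pictured as a nested array in which one index pair selects a pixel and the third index selects a band. The sketch below uses plain Python lists with the assumed layout `cube[y][x][band]`; the layout, the helper names and the toy values are illustrative assumptions, and real systems store the cube in whatever interleave the camera delivers.

```python
def pixel_spectrum(cube, x, y):
    """Return the spectrum (one value per band) recorded at pixel (x, y)."""
    return cube[y][x]


def band_image(cube, band):
    """Return the two-dimensional grey image of a single band."""
    return [[pixel[band] for pixel in row] for row in cube]


def mean_spectrum(cube, mask):
    """Average the spectra of all pixels where mask[y][x] is true."""
    bands = len(cube[0][0])
    total, count = [0.0] * bands, 0
    for row, mask_row in zip(cube, mask):
        for pixel, selected in zip(row, mask_row):
            if selected:
                count += 1
                for k in range(bands):
                    total[k] += pixel[k]
    return [v / count for v in total]


# Toy cube: 2 rows, 3 columns, 4 bands; value encodes (y, x, band).
cube = [[[100 * y + 10 * x + k for k in range(4)] for x in range(3)]
        for y in range(2)]
print(pixel_spectrum(cube, 2, 1))
print(band_image(cube, 3))
```

A seed-level spectrum is typically obtained in this way, by averaging over the pixels that a segmentation step assigns to the seed.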
According to the number of bands, spectral imaging technology is mainly divided into multi-spectral imaging and hyperspectral imaging. An image obtained by multi-spectral imaging is composed of 3~12 discrete bands, while an image obtained by hyperspectral imaging is composed of 100~200 continuous bands. A multi-spectral image has a small data volume and can be processed quickly, so it is more suitable for real-time online detection. Hyperspectral images have higher spectral resolution and contain more sample information, but they usually carry some redundant information, so data dimension reduction is required. The hyperspectral imaging system is shown in Figure 3 [17]. The seed sample is placed on a black sample table to facilitate background segmentation during image processing. The sample table is mounted on a motion control table. As the seed sample on the sample table passes the observation slot of the CCD camera at a constant speed, the imaging system acquires the image row by row.

Figure 3. The hyperspectral imaging system
The amount of spectral data is huge, and besides the information of the detected object it may contain redundant information such as background and noise. Therefore, feature extraction is often used as a pre-processing step for image classification, mixed pixel decomposition, abnormal target detection and other hyperspectral applications [4]. Common variable screening and feature extraction algorithms for high-dimensional data include principal component analysis (PCA), linear discriminant analysis (LDA), isometric feature mapping (ISOMAP), stepwise regression (SR), partial least squares (PLS), the successive projections algorithm (SPA), uninformative variable elimination (UVE), the genetic algorithm (GA) and competitive adaptive reweighted sampling (CARS) [18].
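As an illustration of dimension reduction with PCA, the first method in the list above, the sketch below extracts the first principal component of a set of spectra and the score of each spectrum on it. Power iteration is used here only to keep the example free of external libraries; it is an assumption of this sketch and not the numerical method of any cited work, and the function name is hypothetical.

```python
from math import sqrt


def first_principal_component(X, iterations=200):
    """Return (band means, loading vector, scores) of the first component.

    X is a list of spectra (rows). The start vector is arbitrary; if it
    happened to be orthogonal to the true component the iteration would
    stall, so this is a sketch rather than a robust implementation.
    """
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - mean[j] for j in range(p)] for row in X]
    start_norm = sqrt(sum((j + 1) ** 2 for j in range(p)))
    v = [(j + 1) / start_norm for j in range(p)]
    for _ in range(iterations):
        # Multiply v by Xc^T Xc without forming the covariance matrix.
        s = [sum(r[j] * v[j] for j in range(p)) for r in Xc]
        u = [sum(Xc[i][j] * s[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(a * a for a in u))
        if norm == 0:
            break
        v = [a / norm for a in u]
    scores = [sum(r[j] * v[j] for j in range(p)) for r in Xc]
    return mean, v, scores


# Toy spectra that vary along a single direction in band space.
X = [[c * 1.0, c * 2.0, c * 2.0] for c in range(4)]
mean, loading, scores = first_principal_component(X)
print(loading)
print(scores)
```

Each spectrum is thereby replaced by one score instead of one value per band; further components can be found by removing the first component from the data and repeating.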
Methods for establishing the relationship model between spectral data and the indexes to be tested mainly include Multiple Linear Regression (MLR), Artificial Neural Network (ANN), Support Vector Machine (SVM) and Partial Least Squares Regression (PLSR), among which PLSR and ANN are the more widely used algorithms [10].

Application status of spectral image in seed quality inspection

Purity test of seeds
It is inevitable that seeds are mixed with other seeds or with impurities such as gravel during harvest. Rapid seed quality identification helps improve seed purity and ensures crop phenotypic consistency and crop yield. Different varieties of rice have different reflectance intensities, so hyperspectral samples of rice with different degrees of adulteration are separable and suitable for modeling. Sun et al. [19] mixed two kinds of rice in 5 proportions and obtained spectral images of 200 rice samples with a hyperspectral imaging system. In their work, principal component analysis (PCA) was used to reduce the dimension of the hyperspectral data, and 6 characteristic wavelengths and 9 optimal principal components were selected. Rice adulteration was then modeled by a support vector machine (SVM). The classification accuracy exceeded 93% in the modeling experiment, which indicates that hyperspectral imaging is feasible for detecting rice adulteration.
Consumer demand for vegetable oils rich in monounsaturated fatty acids (MUFA) and polyunsaturated fatty acids (PUFA) has increased in recent years due to concerns over diet balance and health. However, natural or man-made adulteration not only reduces the proportion of beneficial ingredients in vegetable oils but also poses health risks. To determine whether mustard seed is adulterated with thistle poppy seed, Rahul et al. [20] combined attenuated total reflection Fourier transform infrared spectroscopy with multivariate chemometrics, using PCA for dimensionality reduction, linear discriminant analysis (LDA) for classification, and principal component regression (PCR) and PLSR to model the spectral features, and proposed a rapid algorithm to detect the purity of mustard seed. The experiment shows that the PLSR calibration model performs best in the wavelength range 5555~20000 nm, with a relative prediction error of 0.033. The minimum detection level of this algorithm is 1% V/V, that is, adulteration of mustard oil can be detected as long as its concentration is not less than 1%. The root-mean-square error of prediction is 0.2, indicating that spectral analysis combined with PLSR modeling has advantages in purity detection.
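Calibration models in this section are compared through figures of merit such as the root-mean-square error of prediction and the correlation coefficient between measured and predicted values. The following minimal sketch shows how these two quantities are usually computed; the exact definitions used in the cited studies (for example of the relative prediction error) are not given in this review, so the code covers only the two standard formulas, and the function names are hypothetical.

```python
from math import sqrt


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values."""
    n = len(measured)
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def correlation_coefficient(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Toy validation set: reference values and hypothetical model predictions.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.9]
print(rmse(measured, predicted))
print(correlation_coefficient(measured, predicted))
```

Computed on a calibration set the first quantity is usually reported as RMSEC, and on an independent validation set as RMSEP.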

Test for moisture and lipid content of seeds
Water content is an important factor in evaluating seed quality and storage status. Rice with too high a water content tastes worse, is harder to process, store and transport, and is prone to mildew. To detect the water content of brown rice, Heman et al. [21] used multiple linear regression (MLR) and partial least squares regression (PLSR) to model the visible/near-infrared spectrum of brown rice. The experiment showed that the PLSR calibration model predicted the water content of brown rice well, with a validation-set correlation coefficient of 0.92. Lin et al. [22] designed and developed a rapid detection sensor for rice seed water content based on the near-infrared spectrum, which enables real-time online measurement. In the experiment, PLS, CARS (Competitive Adaptive Reweighted Sampling) and SLR (Least Square Regression) were used to model and analyze the spectral information, and the rice spectrum at about 1450 nm was found to be the most sensitive to water content. However, the photoelectric conversion in the detection device relies on a light-emitting diode, and the output signal is easily affected by ambient temperature and humidity, so the detection accuracy needs to be improved. Caporas et al. [23] were the first to use a hyperspectral prediction model to analyze the moisture and lipid content of whole coffee beans, with prediction errors of 0.28% and 0.89%, respectively. To study how near-infrared spectroscopy and hyperspectral imaging affect wheat quality detection, Wu et al. [24] established the optimal near-infrared spectroscopy model and the optimal hyperspectral model for the water content of wheat seeds. In their comparison, the indexes of the hyperspectral imaging model were better than those of the NIR spectral analysis model, and its accuracy and predictive ability were higher, so it can be well applied to wheat seed quality evaluation.

Vitality test of seeds
Seed vigor refers to the ability of a seed to sprout and emerge from the soil. Seeds with high vigor withstand storage and have an obvious growth advantage, which is of great economic significance for increasing agricultural production. Li et al. [25] were the first to study the selection of hyperspectral characteristic bands of sweet corn using the FIPLS (Forward interval Partial Least Squares), CARS and UVE (Uninformative Variable Elimination) variable screening methods, and proposed a fast screening method for high-vigor corn seeds. With this method, the near-infrared diffuse reflectance spectra of sweet corn seeds aged in stages at high temperature were collected, and quantitative models of germination rate, germination index and vigor index were established by partial least squares regression (PLSR). The cross-validation correlation coefficients of the three models were 0.894, 0.940 and 0.871, respectively, indicating that the method can reliably determine the three parameters at one time. To avoid the seed damage caused by existing grading processes, Peng et al. [26] collected hyperspectral images of tomato seeds, extracted characteristic wavelength images using the successive projections algorithm (SPA), obtained a grading threshold on the calibration set in combination with the standard germination test, and then used the threshold to grade the validation set. The seed classification algorithm achieved its highest accuracy at a wavelength of 713 nm, with 93.75% on the calibration set and 90.48% on the validation set. At present, seed vigor detection methods based on spectral images can analyze seed vigor effectively, but they cannot yet classify the seed vigor level accurately.

Research status of image processing algorithm
With the development of image processing technology, more and more researchers have begun to study artificial intelligence methods based on image engineering to realize rapid automatic detection of seed quality. As shown in Fig. 4 [27], image engineering integrates image processing, image analysis and image understanding into an overall framework. The result of image processing can be used as the input of image analysis or image understanding, and the preliminary result of image analysis or image understanding can in turn be used to improve image processing. In practice, noise interference, seed adhesion and other problems make this research difficult. Therefore, the most difficult tasks for seed image processing algorithms are removing image noise and segmenting adhering seeds.

Image denoising
The purpose of image denoising is to remove the noise that contaminates an image and thereby improve image quality. Common noise types include salt-and-pepper noise and additive Gaussian noise [28]. Existing image denoising algorithms include the Gaussian smoothing filter [29], the median filter [30], the anisotropic diffusion filter [31], the total variation (TV) model [32] and the non-local means (NLM) filter [33].
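Of the filters listed above, the median filter is the simplest to illustrate and is the classical choice for salt-and-pepper noise. The sketch below applies a 3x3 median filter to a grey image stored as a list of rows. Truncating the window at the image border and taking the upper middle value for even-sized windows are assumptions of this sketch, and the function name is hypothetical.

```python
def median_filter3(img):
    """Replace every pixel by the median of its 3x3 neighbourhood.

    Near the border the window is truncated to the pixels that exist,
    which is one of several possible border conventions.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            window = sorted(
                img[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            )
            out[y][x] = window[len(window) // 2]
    return out


# Uniform grey image with one "salt" pixel and one "pepper" pixel.
image = [[100] * 5 for _ in range(5)]
image[2][2] = 255
image[0][0] = 0
for row in median_filter3(image):
    print(row)
```

Because the median ignores extreme values in the window, isolated outliers are removed without averaging them into their neighbours, which is why edges are blurred less than with a Gaussian smoothing filter.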
Li et al. [34] binarized cucumber seed images with a gray-level threshold of 40 and then applied the morphological opening operation to remove noise from the binary images. Some denoising methods for Gaussian noise require the standard deviation σ to be set manually, which can lead to the loss of important image detail. To address this problem, Li et al. [35] proposed an adaptive Gaussian filtering algorithm and used it to denoise RGB images of cucumber with blight spots. Compared with the traditional Gaussian filter, the improved algorithm raised the peak signal-to-noise ratio (PSNR) of the denoised images by 13.8% on average. To solve the problem of artifacts and blurred edges in diffusion tensor images (DTI), Liu et al. [36] used anisotropic filtering and a Riemannian framework, combined with the complex shearlet transform, to effectively remove noise from medical images. Manoj et al. [37] proposed a total variation correction method to suppress noise in computed tomography (CT) images and designed a numerical algorithm that minimizes the exponential directional weighting function in the energy density function with the split Bregman method. The results show that the running time of the algorithm is 3.0195 s and that it restores visually acceptable CT images in a relatively short time compared with existing total variation algorithms. Li [38] proposed a non-local switchable filtering algorithm for high-intensity salt-and-pepper noise that exploits non-local structural similarity. This algorithm gives better results than the classical median filter, NLM and BM3D algorithms.
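Two operations mentioned in this paragraph lend themselves to a short illustration: binarization followed by morphological opening, and the PSNR used to compare denoising results. The sketch below is an assumption-laden illustration rather than the implementation of [34] or [35]: pixels brighter than the threshold are taken as foreground, the structuring element is a 3x3 square, pixels outside the image count as background, and all function names are hypothetical.

```python
from math import log10


def binarize(img, threshold=40):
    """Foreground (1) where the grey level exceeds the threshold."""
    return [[1 if p > threshold else 0 for p in row] for row in img]


def _window(img, y, x):
    """3x3 neighbourhood of (y, x); outside the image counts as 0."""
    h, w = len(img), len(img[0])
    return [img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
            for yy in (y - 1, y, y + 1) for xx in (x - 1, x, x + 1)]


def erode(img):
    return [[1 if all(_window(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img):
    return [[1 if any(_window(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def opening(img):
    """Erosion followed by dilation; removes specks smaller than 3x3."""
    return dilate(erode(img))


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diffs = [(a - b) ** 2
             for ra, rb in zip(reference, test) for a, b in zip(ra, rb)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return float("inf")
    return 10 * log10(max_value ** 2 / mse)


# Dark background, one 3x3 bright "seed" and one isolated bright noise pixel.
grey = [[10] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        grey[y][x] = 200
grey[0][6] = 200
for row in opening(binarize(grey)):
    print(row)
```

A higher PSNR means the test image is closer to the reference, so a percentage improvement such as the one quoted above refers to an increase of this value.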

Image segmentation
The purpose of image segmentation is to extract the regions of interest in an image, and it is an important step in the transition from image processing to image analysis. At present, research methods in the field of image segmentation include threshold-based segmentation [39], edge-based segmentation [40], region-based segmentation [41], graph-theory-based segmentation [42], energy-functional-based segmentation [43], clustering-based segmentation [44] and neural-network-based segmentation [45].
To count rice grains, Xu et al. [46] used a voltage experiment to obtain the gray-scale histogram of rice ray images with the highest discrimination between rice and background, took the two minima of the smoothed histogram fitting curve as thresholds, and segmented the image into background, solid seed and the overlapping area between seeds. Guo et al. [47] proposed a feature point detection method based on an improved features from accelerated segment test (FAST) detector to segment adhering particles quickly and accurately. The algorithm preprocesses the image to obtain a binary image, obtains edge points by traversing the image with Laplacian of Gaussian edge detection, and is combined with the watershed algorithm based on the h-maxima transform. Segmentation experiments were conducted on adhering corn kernels, glass beads, sand grains and salt grains. The results show that this method largely compensates for the tendency of the watershed algorithm to over-segment, and that its segmentation accuracy is about twice that of the traditional watershed algorithm, exceeding 95%. In terms of time efficiency, the average running time of this method is 1/3~1/2 of that of other improved watershed algorithms. Jiang et al. [48] proposed a three-stage particle segmentation method based on the fuzzy boundaries between mineral particles and the idea of semantic segmentation. As shown in Figure 5, the algorithm is divided into three stages: generating a super-pixel set, extracting super-pixel semantic features, and region merging. In stages 1 and 2, a convolutional neural network trained on labeled sandstone particle images is used to extract the semantic features of the super-pixels; in stage 3, multi-feature clustering is used to merge super-pixels and generate the particle segmentation result. Tests showed that this method can effectively segment sandstone images. To handle images with unevenly distributed gray levels, Han et al. [49] combined an energy functional of local gray-level entropy information with non-convex regularization constraints and obtained satisfactory segmentation results that retain edge information.
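The threshold-based approach described at the start of this discussion, in which two minima of a smoothed histogram separate three pixel classes, can be sketched as follows. This is a simplified illustration and not the procedure of [46]: a moving average stands in for the curve fitting, the valley search is deliberately naive, and the function names and toy data are hypothetical.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each grey level."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts


def smooth(counts, radius=2):
    """Moving average; the window is truncated at both ends."""
    out = []
    for i in range(len(counts)):
        window = counts[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def valley_thresholds(curve, count=2):
    """Grey levels of the deepest valleys that lie between two peaks."""
    candidates = []
    for i in range(1, len(curve) - 1):
        is_local_min = curve[i] < curve[i - 1] and curve[i] <= curve[i + 1]
        rises_again = max(curve[i + 1:]) > curve[i]
        if is_local_min and rises_again:
            candidates.append(i)
    deepest = sorted(candidates, key=lambda i: curve[i])[:count]
    return sorted(deepest)


def classify(pixel, thresholds):
    """Class index = number of thresholds the pixel value exceeds."""
    return sum(pixel > t for t in thresholds)


# Toy image with 16 grey levels and three populations of pixels.
pixels = ([1] * 5 + [2] * 20 + [3] * 5 + [6] * 3 + [7] * 10 + [8] * 3
          + [12] * 2 + [13] * 6 + [14] * 2)
curve = smooth(histogram(pixels, levels=16), radius=1)
thresholds = valley_thresholds(curve)
print(thresholds, [classify(p, thresholds) for p in (2, 7, 13)])
```

Which class corresponds to background, single seed or overlapping seeds depends on how the imaging modality maps material thickness to grey level, so that assignment is left open here.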

Discussions
In recent years, with the development of optical detection, image processing and data analysis, seed quality inspection based on spectral images has made good progress. Table 1 lists the basic application results of spectral images in seed quality inspection. In general, spectral imaging combined with artificial intelligence has good application prospects in the detection of seed purity, water content, lipid content and vitality. For different seeds or target characteristics, spectral analysis needs an appropriate characteristic wavelength extraction method and an appropriate modeling method. As shown in Table 1, the current mainstream characteristic wavelength extraction methods include PCA, PLS, CARS, FIPLS and SPA, and PLSR is the most effective and widely used modeling and analysis method.
To analyze spectra better, it is still necessary to find an objective and effective band selection method and to establish more correlation models between spectral information and characteristic attributes. Due to the high cost of spectral detection equipment, seed quality inspection based on spectral images is still at the laboratory stage, and mass-produced equipment has not yet been put into practical use. In addition, many studies have examined the detection model of only one kind of seed, and further experimental exploration and verification are needed for the stability and transferability of the models. Algorithms for image denoising and segmentation have also made good progress, but some problems remain. On the one hand, the handling of image edge blurring and information loss still has room for improvement. On the other hand, how to improve the precision and reduce the complexity of the algorithms remains to be studied.

Conclusions
The application of artificial intelligence methods that combine spectral image processing and data analysis has significantly improved seed quality control. It offers nondestructive, fast and real-time online detection and shows great potential in agricultural modernization. Based on the current research status of spectral imaging and image processing technology in seed quality inspection, this paper proposes the following prospects:
1) Improve the efficiency of image denoising and segmentation algorithms, reducing the running time while obtaining high-accuracy segmented images;
2) For the mass data of spectral information, select an appropriate dimension reduction method that reduces the amount of data to be processed while preserving the integrity of the information, so as to improve the efficiency of data processing;
3) Include a variety of seeds in seed quality analysis to improve the stability and transferability of detection models;
4) For seed vigor detection, pay attention to seed vigor classification. Our team intends to measure the bud length and root length in seed germination images by spectral technology combined with image processing, and to establish a relationship model between seed vigor and bud length and root length, so as to analyze the vigor level of seeds.