Random Forest (RF) based identification of rice powder mixture using terahertz spectroscopy

Adulteration is a severe problem in agriculture. Dishonest traders adulterate products to obtain higher profit. For consumers, adulterated agricultural products are unhealthy: they may cause severe health problems and even incurable diseases. It is therefore timely to detect the level of adulteration in agricultural products. To detect rice adulteration, spectral data are collected for mixed rice powder (low-quality and high-quality rice powder), and a Fast Independent Component Analysis (FICA) algorithm is used to extract valuable information. A random forest classifier is then applied to classify whether the mixture is of low or high quality. The approach achieves high accuracy with a single model and is easy to implement.


Introduction
Rice is one of the most important agricultural products in the world. Most people consume rice without knowing its quality. Some consume rice as whole grain, while others use rice powder to prepare food. The problem is that people who consume rice in powder form do not really know the quality of the powder. All rice powders are white, whether of low or high quality, so adulteration in powder form is very hard to detect and is not visible to the eye. Adulteration of food, whether for commercial gain or for extra income, is unhealthy and life threatening. Rice is an indispensable ingredient in everyday food, so its adulteration creates an environment in which many people fall ill. Consumers in many places fail to identify adulterated products. Producers and sellers of various products accomplish this by deceiving people into buying their substandard, counterfeit food while giving the authorities what they are owed. Some dishonest traders knowingly sell adulterated products that harm people's physical health. In daily life, the food items bought in stores contain a variety of additives that cannot be detected by eye. The admixture is done in such a way that one cannot tell the difference between the good product and the adulterated one. People unknowingly pay for it, eat it, and suffer from various diseases. Currently, the mixing of 'plastic' rice with rice, and fake eggs, are painful examples. The practice of adulteration itself does not change; only the adulterant changes with the passage of time. It is shocking that food can be adulterated in this way without changing the appearance of the product. The methodology presented here is helpful for detecting high-quality rice powder mixed with low-quality rice powder.
How can these traders play with people's lives to make money in the wrong way? It is painful to see a few people adulterating products for profit. Food adulteration is not new; it has been going on for many years. In the case of rice, substandard rice, small grains, soil, paddy and bran are mixed with rice and sold. The advent of plastic foods in this age of technology has also caused great fear. These plastics, even if they are in the soil, can cause insomnia, disrupt the movement of the esophagus and the digestive organs, and cause life-threatening harm. This can lead to abdominal pain, indigestion and vomiting and, if left untreated, to cancer, kidney disease, heart attack, and vascular rupture.
In the early stage, adulteration of rice powder was identified on the basis of color and texture. This is one of the traditional methods for identifying rice mixtures. It cannot determine the exact adulteration ratio of the rice powder; it only indicates whether the rice is adulterated or not. Environmental factors also affect its results. Later, chemical methods were used to detect adulteration of rice powder. These methods involve complex pre-treatment and expensive equipment, are lengthy and hard to implement, and likewise cannot determine the exact adulteration ratio of mixed rice powder. Nowadays, spectral technology is widely used to detect adulterated agricultural products. One such method predicts the adulteration level of mixed rice powder from surface liquid content using near-infrared (NIR) spectroscopy [1]. To improve accuracy, NIR was combined with a back-propagation neural network. However, this method is not suited to the long-wavelength spectrum of rice powder, which remains unexplored. Terahertz spectroscopy obtains information from the spectral response in the terahertz band, which cannot be obtained by other traditional methods. Thus, terahertz time-domain spectroscopy is an effective tool for detecting adulteration at various levels. Wenwen et al. (2013) proposed identifying rice seed cultivars using NIR spectroscopy. In that study [16], NIR spectral data are used as input, and the Partial Least Squares Discriminant Analysis (PLS-DA) algorithm is used to extract the information for classification. Three classifiers are used and their results are compared with one another. PLS-DA with a KNN classifier gives 80% accuracy, whereas SIMCA with random forest gives 100% accuracy. However, the approach is not suitable for the longer-wavelength spectrum.

Related work
Jianjun et al. (2014) proposed identifying transgenic cotton seed based on GA-SVM. In that study [17], terahertz spectral data are used as input. Principal component analysis (PCA) is used to extract the corresponding features, and an SVM classifier is used for classification. The accuracy of this method is 96.67%. To improve accuracy, the present study replaces PCA with FICA. Saritha et al. (2017) proposed identifying the adulteration of black pepper with papaya seed. In that study [6], real-time images are captured by a camera, and pre-processing is then used to remove unwanted noise from the input image. Contrast, texture and shape features are extracted to obtain an accurate classification result. Finally, a KNN classifier is used to classify the adulteration in black pepper. The accuracy of this method is 90%.

System implementation
In this study, four steps [1] are involved in finding the various levels of adulteration in mixed rice powder (Figure 1).
• Step 1, collecting spectral data: spectral data are collected for mixed rice powder at different mixing levels.
• A random forest classifier is used for classification. The classifier is trained on the values calculated in the preceding steps.
• The testing spectral samples are then compared with the trained samples.
These steps are important for calculating the various levels of adulteration in rice powder, and the method is very effective for detecting adulteration in different kinds of mixed rice powder.

Spectral data collection
In this study, images from a standard database are used under standard criteria [1]. Before the terahertz data are captured, the rice samples are ground into powder.

Fast independent component analysis
In the existing system, principal component analysis (PCA) is used to extract feature values, with clustering according to the requirements. PCA does not provide the exact component values. Therefore, in this study Fast Independent Component Analysis (FICA) is used for feature extraction. It provides the exact component value for each cluster component and is also faster than PCA.
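The FICA formulation itself is not spelled out in this study. As a sketch only, assuming the standard FastICA formulation (the symbols below are assumptions introduced for illustration), the observed spectra x are modelled as a linear mixture of independent components s, and each unmixing vector w is found by a fixed-point iteration on the whitened data z:

```latex
% Mixing model and its estimated inverse (A, W, s are illustrative symbols)
\mathbf{x} = \mathbf{A}\,\mathbf{s}, \qquad \hat{\mathbf{s}} = \mathbf{W}\,\mathbf{x}

% One-unit FastICA fixed-point update on whitened data z, followed by normalization
\mathbf{w}^{+} = \mathbb{E}\{\mathbf{z}\, g(\mathbf{w}^{\top}\mathbf{z})\}
               - \mathbb{E}\{g'(\mathbf{w}^{\top}\mathbf{z})\}\,\mathbf{w},
\qquad
\mathbf{w} \leftarrow \frac{\mathbf{w}^{+}}{\lVert \mathbf{w}^{+} \rVert}
```

Here g is a smooth nonlinearity such as g(u) = tanh(u), with derivative g'(u) = 1 - tanh^2(u). The update is repeated until w stops changing direction.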

Feature extraction
The feature extraction step discovers the features that best represent the adulteration level of mixed rice powder. Two kinds of features are used: temporal and spectral.

Temporal feature
The temporal feature is one of the important features in feature extraction. Most real-time images are taken at different times, and correlations among the images are used to calculate the changes between images in the dataset. Temporal features extract valuable information, such as amplitude and energy, based on time-domain analysis, and indicate whether the images in the dataset are associated with time or change over time. In this study, the temporal features considered are mean, variance and skewness.
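The three temporal features named above can be computed directly from a time-domain signal. The following is a minimal sketch, assuming the signal is a plain list of amplitude samples and that population (divide-by-n) moments are intended; the function name `temporal_features` and the sample values are hypothetical and not taken from the study.

```python
import math


def temporal_features(signal):
    """Return mean, variance and skewness of a time-domain signal."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must contain at least one sample")
    mean = sum(signal) / n
    # Population variance (divide by n); a sample variance would divide by n - 1.
    variance = sum((x - mean) ** 2 for x in signal) / n
    std = math.sqrt(variance)
    # Skewness is the third standardized moment; it is defined as 0 here
    # for a constant signal, where the standard deviation is zero.
    if std == 0.0:
        skewness = 0.0
    else:
        skewness = sum(((x - mean) / std) ** 3 for x in signal) / n
    return {"mean": mean, "variance": variance, "skewness": skewness}


# Hypothetical amplitude samples, for illustration only.
features = temporal_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
```

A symmetric signal such as the one above has skewness close to zero, while a signal with a long tail of large values has positive skewness.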

Spectral feature
Spectral features are also very important for extracting valuable information from the dataset. They provide frequency-domain metrics for each image in the dataset. It is easy to establish whether a data value is a power spectrum value or a spectral value. Examples of spectral features are spectral centroid, spectral spread, spectral flux, spectral flatness measure and chroma. In this study, the spectral features considered are spectral centroid and spectral variance.
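As an illustration of the two spectral features considered, the sketch below computes the spectral centroid (the power-weighted mean frequency) and the spectral variance (the power-weighted variance of frequency around the centroid). It assumes the spectrum is already available as matching lists of frequencies and non-negative power values; the function name `spectral_features` and the numbers are hypothetical.

```python
def spectral_features(freqs, power):
    """Return spectral centroid and spectral variance of a power spectrum."""
    if len(freqs) != len(power) or not freqs:
        raise ValueError("freqs and power must be non-empty and equally long")
    total = sum(power)
    if total <= 0:
        raise ValueError("total power must be positive")
    # Centroid: frequencies averaged with the power values as weights.
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    # Variance: power-weighted squared distance of each frequency from the centroid.
    variance = sum(((f - centroid) ** 2) * p for f, p in zip(freqs, power)) / total
    return {"centroid": centroid, "variance": variance}


# Hypothetical spectrum: frequencies (e.g. in THz) and their power values.
spec = spectral_features([1.0, 2.0, 3.0], [1.0, 2.0, 1.0])
print(spec)
```

The square root of the spectral variance is the spectral spread listed among the example features.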

Random forest classifier
In this study, a random forest classifier is used to classify the various levels of adulteration in the rice mixture. A random forest works with a tree-like structure: multiple trees form a forest, and each tree provides a vote for the given input. The random forest classifier is fast and provides accurate results based on its training sample data.

Steps involved in Random forest (Refer Figure 3)
• Let the number of already trained samples be A.
• Let the number of testing samples be B.
• In each iteration, the testing samples are compared with the trained samples of the classifier.
• Each tree of the forest gives a vote in every iteration: K1, K2, K3, ...
• Finally, the votes (K1, K2, K3, ...) are combined into the classification result K.
• Repeat steps 2 to 5 to obtain results for the other testing samples.
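The voting part of the steps above can be sketched as follows. This is not the study's implementation: it assumes the usual majority vote for combining the tree outputs K1, K2, K3 into K, and the three "trees" are hypothetical single-threshold rules with made-up thresholds standing in for trained decision trees. Training the trees is not shown.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Collect one vote per tree and return (majority label, all votes)."""
    votes = [tree(sample) for tree in trees]  # K1, K2, K3, ...
    # most_common(1) returns the label with the most votes; on a tie it
    # returns the label that was seen first.
    label, _count = Counter(votes).most_common(1)[0]
    return label, votes


# Hypothetical stand-ins for trained trees; thresholds are illustrative only.
def tree_1(f):
    return "3:0" if f["mean"] > 0.5 else "0:3"


def tree_2(f):
    return "3:0" if f["centroid"] < 1.0 else "0:3"


def tree_3(f):
    return "3:0" if f["variance"] < 0.2 else "1:1"


# Hypothetical feature vector for one testing sample.
sample = {"mean": 0.8, "centroid": 0.7, "variance": 0.3}
label, votes = forest_predict([tree_1, tree_2, tree_3], sample)
print(votes, "->", label)
```

Two of the three trees vote for the 3:0 class here, so the forest returns that class even though one tree disagrees.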

Simulation results & discussion
In this study, the rice spectrum images are collected from the standard database. The implementation starts with acquiring the database images. Next, FICA is applied to cluster the required samples. Features are subsequently extracted from the images and used for training and testing. The temporal and spectral features are extracted to find the adulteration of rice at various levels. The grading accuracy was tested on 30 spectrum images using the random forest classifier. Of these 30 rice spectrum images, 29 gave a positive result and the remaining one gave a negative result for the adulteration level of rice. Figure 4 shows the THz spectral data for mixed rice powder in the 3:0 ratio. Figure 5 shows the reference spectra for the various mixing levels of the rice powder mixture; based on these reference spectra, the adulteration level of a rice powder mixture is found as a ratio such as 1:1, 1:2, 2:1, 0:3 or 3:0. These reference spectra define the five main categories used for classification. Figure 6 shows the estimated graph analysis for FICA with its exact cluster values. These estimated values are then used for feature extraction from the given spectrum. In feature extraction, spectral and temporal feature values are calculated to classify the different ratios of mixed rice powder. In FICA, the values range from the 100th to the 350th. Figure 7 shows a snapshot of the random forest classifier for the adulteration of mixed rice powder; it represents the case where the rice is mixed in the 3:0 ratio. This is a very easy method for finding adulterated products using image processing. No chemicals are used and no human involvement is needed to find the adulterated product.
In this study, whether a product is adulterated or good is determined automatically. The method is somewhat time-consuming only when training on the different kinds of adulterated product. Once training is done, it provides the adulteration result very quickly. The performance metric is based on the error rate. Many spectral images were used for training, and 30 images were tested for different kinds of adulteration in mixed rice powder. The accuracy was found using this error-rate-based metric, with only one negative result among the tested images.
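The error-rate-based metric can be worked through with the counts reported above (30 tested spectrum images, one negative result). The sketch below assumes accuracy is defined as one minus the error rate; the function names are hypothetical.

```python
def error_rate(n_tested, n_misclassified):
    """Fraction of tested samples that were classified incorrectly."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return n_misclassified / n_tested


def accuracy(n_tested, n_misclassified):
    """Accuracy expressed as the complement of the error rate."""
    return 1.0 - error_rate(n_tested, n_misclassified)


# Counts taken from the results above: 30 tested images, 1 negative result.
err = error_rate(30, 1)
acc = accuracy(30, 1)
print(f"error rate: {err:.2%}, accuracy: {acc:.2%}")
# prints: error rate: 3.33%, accuracy: 96.67%
```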

Conclusion
The aim of this study was to identify different kinds of adulterated rice mixture. Initially, spectrum images were collected from a standard database [1]. The valuable information was gathered using FICA. Temporal and spectral features were then extracted from the FICA output to determine the effective data and classify the samples. These features are used to grade the various adulteration levels of the rice mixture. Different kinds of samples were used for training, and 30 spectrum images were tested. The images were classified using the random forest classifier, and all of the rice spectra gave a true positive result except for one spectrum image. The accuracy of rice adulteration detection was calculated from these test results. The overall accuracy of the random forest classifier was 96.67% (29 of 30 test spectra). The random forest classifier provided better results in identifying each category across the various adulteration levels of