Development of Tomato Septoria Leaf Spot and Tomato Mosaic Diseases Detection Device Using Raspberry Pi and Deep Convolutional Neural Networks

Machine learning techniques are revolutionizing multiple industries, and a considerable body of research has been devoted to mitigating the effects of pests and diseases on food production. Identifying plant diseases early can help reduce the damage they cause. This paper proposes the use of a Deep Convolutional Neural Network (DCNN) as a classification technique, using the Keras and TensorFlow Python machine learning libraries to build a model deployed on a hand-held Raspberry Pi device for on-site plant disease classification. Convolutional Neural Networks (CNNs) can automatically recognize regions of interest in images, which reduces the need for image processing. Images were obtained from plantvillage.org and split into training, testing and validation sets. The training images were augmented and fed into a DCNN model for training. The model was then tested on the test set to check against overfitting before finally being used to detect disease on the validation set, where it showed very positive results. The results of this research show that DCNNs and the framework in this paper can be used to develop highly efficient plant disease detection models.


INTRODUCTION
Modern technology is giving farmers new methods of producing more food; however, food production is still quite low. The Food and Agriculture Organization (FAO) of the United Nations (UN) estimates that the global population will rise to about 9.6 billion people by 2050, and that food production will need to increase by 70 percent by 2050 to match that figure [1].
Food production is being reinforced with smart computing technology, ranging from environment control and disease detection to disease prediction. Machine learning techniques are driving the trend of smart farming [2].
Leaves are vital parts of a plant, being the major site of photosynthesis, which is a major source of nutrition and growth for plants. This research is directed towards detecting two tomato diseases, tomato septoria leaf spot and tomato mosaic. These diseases damage the surface of tomato leaves by creating lesions, which reduce the photosynthetic area available for light penetration and in turn reduce yield [3], [4]. The model generated from this research can also detect healthy tomato leaves.
Open-source devices like the Raspberry Pi and Arduino are becoming faster and more readily available thanks to their low prices and customizability, which has allowed developers and engineers to build tools and devices for multiple industries. This project is aimed at building an agricultural solution for tomato disease detection using a DCNN machine learning technique deployed on Raspberry Pi devices.

RELATED WORKS
Several approaches to plant disease detection have been explored within the machine learning domain. The authors of [5] demonstrated an approach that used five convolutional neural network (CNN) layers to detect northern leaf blight (NLB) infections in plant leaf images obtained from farms; their proposed method achieved an accuracy of 96.7%.
Random forest has also been used by Dalphy et al. (2016) [6] to predict mummy berry infection caused by the fungus Monilinia vaccinii-corymbosi on blueberry plants. This method aims to predict the occurrence of a disease rather than detect it; hence their source of data was not images. Instead, they collected and monitored weather, plant, and soil conditions, which were fed into a random forest algorithm, producing a highest accuracy of 78%. Recurrent Neural Networks (RNNs) are well suited to learning temporal patterns, since their recurrent connections allow them to learn from sequential and time series data and so predict future infections. The work in [7] curated historical data on rice blast from three regions in South Korea (Cheolwon, Icheon and Milyang) with the aim of using a variety-based Long Short-Term Memory (LSTM) RNN to forecast early rice blast infections on different varieties of rice. The accuracy scores differed with the region and the data size of each region.
Brahimi et al. (2017) [8] proposed a system for detecting tomato diseases based on CNNs, leveraging their ability to automatically detect important features in images. With 14,828 images to classify into 9 classes, they achieved an accuracy of 99.18%. Training was carried out on a model pre-trained on ImageNet, with the output layer replaced to return one of 9 classes rather than the 1000 classes contained in ImageNet. They also went further to visualize the infected regions of the plant leaves using heat maps and image segmentation.
Image processing techniques have been combined with machine learning models to build highly efficient models. The authors of [9] used segmentation techniques to separate infected areas of images and support vector machines (SVM) to detect Phytophthora infestans (commonly known as late blight) and Alternaria solani (early blight) in potato plants. Segmentation also aided in removing the background pixels of the image, since they do not contain information for disease detection. This approach achieved 95% accuracy over 300 images. Convolutional Neural Networks have proven to be very effective in producing more accurate results, and their suitability for images gives them a great advantage in disease detection, since cameras can serve as the means of data retrieval.

METHODOLOGY
A convolutional neural network was selected for this research because of its good performance on image datasets, which are suitable for representing real-world data since images can be captured by the cameras on numerous devices [10] [11]. The CNN was built using the Keras machine learning API with TensorFlow for GPU as the backend, and trained on 643 images per class. The model was built and trained on a computer with 6 gigabytes of graphics memory; attempts to train the model on a Raspberry Pi device proved abortive.

Data acquisition
The dataset for this project was obtained from plantvillage.org [12] and split into an even number of images for each category, so that the model does not favour any particular class over another. 643 images were selected for each class; 80% was used for training, 20% of that data was used for validation [13], and the rest were used for testing. The images were taken with a high-resolution camera, which might affect results obtained with low-resolution cameras [14].
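
The per-class split can be sketched in a few lines of Python. This is a minimal illustration rather than the script used in this work. It assumes that the 20% validation share is taken out of the 80% training pool and that the remaining 20% of each class is held out for testing. The function name split_class and the file names are hypothetical.

```python
import random

def split_class(paths, train_frac=0.8, val_frac=0.2, seed=0):
    """Split one class's image paths into train, validation and test lists."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)      # reproducible shuffle
    n_pool = int(len(shuffled) * train_frac)   # training pool (assumed 80%)
    n_val = int(n_pool * val_frac)             # validation share of that pool
    val = shuffled[:n_val]
    train = shuffled[n_val:n_pool]
    test = shuffled[n_pool:]                   # the remaining images
    return train, val, test

# Hypothetical file names standing in for the 643 images of one class.
paths = [f"septoria_{i:04d}.jpg" for i in range(643)]
train, val, test = split_class(paths)
print(len(train), len(val), len(test))  # 412 102 129
```

Applying the same function to each of the three classes keeps the classes balanced in every subset, which is the property the split is meant to preserve.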

Building the model
After the data had been acquired and split into the necessary sets for training and validation, a multiclass CNN was built with the architecture shown in Table 1.

Training and validation
The training process involved flipping, rotating, zooming and resizing the images to augment the dataset before it was fed into the network; this way the model can adjust to different images regardless of their size or orientation [15]. The model then went through training and validation on the dataset for 100 epochs, after which it achieved an accuracy of 99.02% in training and 99.01% in validation, as shown in Figure 2.
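
To make the augmentation operations concrete, the following library-free sketch applies a flip, a rotation, a centre zoom and a resize to a tiny nested-list image. It is an illustration under stated assumptions and not the pipeline used in this work: the function names, the 90-degree rotation, the zoom factor of 2 and the nearest-neighbour resize are all choices made for the example, and in practice these steps would normally be delegated to the image preprocessing utilities of a library such as Keras.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def resize_nearest(img, out_h, out_w):
    """Resize with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def zoom_center(img, factor):
    """Zoom in by cropping the central 1/factor of the image and resizing it back."""
    h, w = len(img), len(img[0])
    crop_h, crop_w = max(1, int(h / factor)), max(1, int(w / factor))
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    crop = [row[left:left + crop_w] for row in img[top:top + crop_h]]
    return resize_nearest(crop, h, w)

def augment(img):
    """Return the original image plus three augmented variants."""
    return [img, flip_horizontal(img), rotate_90(img), zoom_center(img, 2)]

# A 4x4 grayscale toy image standing in for a leaf photograph.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
variants = augment(image)
```

Each variant keeps the label of the original image, so one labelled photograph yields several training examples with different orientations and scales.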

Classification
This is the final stage of the process, in which the CNN is used to classify images into one of the three classes. The model was tested with completely different images to assess its performance, and showed high accuracies of 99.98% and 99.99% respectively, as depicted in Figure 3 and
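
The final decision step of a three-class classifier can be sketched as follows. This is a minimal illustration that assumes the network ends in a softmax over three outputs; the class order, the function names and the scores are made up for the example and are not taken from the trained model.

```python
import math

# Class order is an assumption; the paper names the three classes but not their indices.
CLASS_NAMES = ["septoria_leaf_spot", "mosaic", "healthy"]

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Return the most probable class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs[best]

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Made-up scores for three test images, for illustration only.
scores = [[4.0, 0.5, 0.2], [0.1, 3.2, 0.3], [0.2, 0.4, 2.9]]
labels = [classify(s)[0] for s in scores]
print(labels, accuracy(labels, CLASS_NAMES))
```

The probability returned alongside the label is the per-image confidence, while the accuracy function computes a test-set figure as the fraction of correct predictions.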

Deployment
A Raspberry Pi Model B+ was the device used for testing the project. It has a camera (depicted in Figure 5), 1 gigabyte of RAM and 32 gigabytes of storage, and is powered by a 20,000 milliamp-hour power bank.
The trained model was then transferred to a Raspberry Pi device, which farmers can use as a handheld device to perform disease detection. Prior to this, a farmer might have to take the leaves to a lab or guess the health status of a plant, which could be time-consuming and inaccurate respectively.
The model was tested on the device and produced results similar to those obtained when the model was tested after it was built. Mobile phones could be handier, but this combination of components is cheaper for farmers than mobile phones capable of running machine learning models.

CONCLUSION
This research focused on developing a mobile/handheld device for the detection of tomato septoria leaf spot and tomato mosaic diseases using deep convolutional neural networks and an image dataset from plantvillage.org. The CNN was able to automatically extract the relevant features from images and achieved a validation accuracy of 99.01%. This model can be used by farmers to automatically diagnose leaves for diseases; however, results on farmland might vary because the images in the dataset were taken in a controlled environment.
Future improvements could include building TensorFlow Lite (tflite) versions of such models so that they can run on mobile devices. This would give farmers a tool in their hands to make diagnoses in a familiar interface, assuming the cost of good smartphones drops enough to compete with the Raspberry Pi.