Detection of Pine Wood Nematode Infestation Based on Improved YOLOv5

Pine wilt disease is a destructive forest disease caused by pine wood nematodes parasitizing pine trees. Once infected, pine trees quickly wither and die because they can no longer conduct water, hence the disease is also known as pine tree withering disease. It is mainly spread by pine sawyer beetles and is characterized by fast dissemination, rapid onset, and a high mortality rate, making it a significant plant disease worldwide. If pine wilt disease is not controlled effectively and promptly, large numbers of pine trees in a forest area can die within a short period. With the continued development and popularization of artificial intelligence and UAV (Unmanned Aerial Vehicle) technology, combining these methods makes it possible to detect diseased trees promptly, and timely treatment and protection can greatly reduce the time and personnel costs of biological pest control. It is also conducive to promoting work in related disciplines, such as bioengineering and greening engineering. In this paper, we first conducted a large number of UAV aerial surveys of the forest area, collected the relevant data, and preprocessed the data (data augmentation, cleaning, filtering, deduplication, and formatting) to ensure data quality and accuracy. We then labeled the data and improved the original YOLOv5 by adding a new RRAM (Recurrent Residual Attention Module) to the network, which enables the network to focus on important information in redundant data and thus improves its performance. Compared with the original YOLOv5, our network achieves better performance.

enables simultaneous completion of object detection and positioning tasks with only one forward pass on the entire image, resulting in very fast performance. In addition, there are many research works based on YOLOv5 for improvement and optimization, such as using convolutional neural networks (CNNs) to improve detection accuracy and introducing attention mechanisms to enhance model robustness. These research works constantly promote the development and innovation of target detection technology, providing more reliable technical guarantees for achieving more precise and efficient object detection applications.
The most obvious feature of pine trees infected with pine wilt disease is a change in color that contrasts sharply with the surrounding healthy pine trees and can generally be identified by visual inspection. In the early stages, due to technological limitations, traditional manual inspections were usually carried out by forestry technicians in the field, which was time-consuming, labor-intensive, and inefficient. With the development of UAV remote sensing technology, aerial monitoring of diseased pine trees by UAV has greatly improved the efficiency of pine wilt disease surveys. Although UAV remote sensing technology can effectively improve survey efficiency, it has not removed the need for a large number of personnel. In recent years, with the rapid development of computer technology, deep learning can be used to sample and recognize most of the pine wilt disease images collected by UAV, greatly reducing the amount of manual work, improving the monitoring efficiency of diseased pine trees, and reducing monitoring costs, thus restraining the rapid spread of pine wilt disease.
In this paper, we selected the region of Zhoushan in Zhejiang Province, China, where pine wilt disease is relatively severe. Multiple drone flights were conducted in October 2021 and October 2022. After the flights were completed, image quality checks were performed, and low-quality images were removed. Then, orthorectification processing was carried out to form UAV orthoimage maps. The obtained data were annotated to produce the final dataset. On the model side, we improved the network to account for the large amount of redundant data. We proposed the RRAM, which combines the recurrent residual structure and the attention mechanism, to enable the network to adapt to the noisy or redundant information in images collected by UAV. This further improves the performance of the network.
In this paper, our contributions are as follows:
• We collected multiple aerial images of pine forests at different times using UAV and cleaned and annotated the images. Based on this data, we established a relevant dataset.
• We improved the network model based on the characteristics of high redundancy and high noise in forest area data obtained by aerial photography. We proposed RRAM to address these issues. Through our ablation experiments, we found that this module increased the network's precision on the forest dataset by 3.26%.

2. RELATED WORK
Pine wilt disease is a devastating disease that affects various species of pine trees worldwide. The disease is caused by the pine wood nematode, which invades the resin canals of pine trees and results in wilt, death, and eventually, economic loss [5]. One effective way to control the spread of this disease is to detect and isolate infected trees promptly.
In recent years, remarkable advances in remote sensing technology have made UAVs a popular tool for quickly and efficiently identifying infected trees in large pine forests. Equipped with high-quality cameras, these UAVs can capture high-resolution images of the forest canopy, which can then be analyzed to detect signs of infection. However, manual analysis of these images can be prohibitively time-consuming and expensive, rendering it impractical for surveying large areas.
To address this issue, researchers have developed machine-learning algorithms for target detection in UAV images. These algorithms use a combination of deep learning techniques and computer vision to analyze images and locate infected trees. In particular, object detection algorithms such as Faster R-CNN, YOLO, and SSD have shown promising results for detecting infected trees in UAV images [6]. These algorithms can identify the location of infected trees and provide accurate information on the extent of the disease, enabling targeted interventions to halt the spread of the disease.
To improve the accuracy of object detection algorithms, researchers have also developed new datasets and training methodologies tailored to UAV images of pine forests [7] [8]. For example, the development of a large-scale dataset of UAV images from pine forests has allowed researchers to train object detection models specifically for pine wilt disease detection. Furthermore, training methods such as transfer learning and data augmentation have been utilized to improve the performance of object detection models with limited training data [9].

3. PROPOSED METHOD
In this section, we describe how we obtained forest area image data from multiple locations at different time periods through aerial photography and created the corresponding datasets for subsequent training. We also describe how we improved and fine-tuned the YOLOv5 model based on the data characteristics, making it more suitable for detecting pine wilt disease in forest areas.

3.1. UAV Aerial Forest Area Dataset
Due to natural limitations, the vegetation in the forest areas of the Zhoushan Islands mainly consists of pine forests that are tolerant to poor soil conditions. Moreover, as the forest areas are scattered across many islands, surveying large areas of forest and continuously collecting images proves difficult. Taking only the 12 streets in Dinghai District, Zhoushan City as an example, the area covered by aerial drone photography reached 300 km², with over 60,000 images collected and nearly 500 GB of data to be processed. Manually processing such a huge amount of image data was challenging and time-consuming.
Therefore, we performed preliminary detection and image screening of diseased trees during the initial stage of UAV image collection. The preliminary screening results were then sent to the ground station for further analysis, so that this two-stage collaboration mode greatly improved processing efficiency.
We primarily used the YP-07 composite wing UAV, jointly developed with Zhoushan Cihang Intelligent Technology Co., Ltd., for this study. Over 20 aerial photography missions were conducted in Zhejiang Province, where pine wilt disease is prevalent. The total aerial coverage was approximately 300 km², and the flights were undertaken in clear and windless weather between 10:00 and 11:40 in October 2021 and October 2022 to prevent image distortion and overexposure. After checking the quality of the images, we performed orthorectification processing to generate orthophoto images from the UAV images, as shown in Figure 1.
After obtaining the images, we manually labeled them and cropped them to 640 × 640 resolution. We used data augmentation algorithms to generate more than 10,000 samples, which were split in an 8:2 ratio into training and validation sets. The final dataset is shown in Figure 2.
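The cropping and splitting steps described above can be sketched as follows. This is only a minimal illustration, not the authors' pipeline: the text does not state the tiling scheme, so the sketch assumes non-overlapping 640 × 640 tiles, drops partial tiles at the image edges, and uses a fixed random seed for the 8:2 split. The function names `tile_origins` and `split_train_val` and the example image size are hypothetical, and the augmentation step is not shown.

```python
import random

TILE = 640  # crop size stated in the text


def tile_origins(width, height, tile=TILE):
    """Return top-left (x, y) corners of non-overlapping tile x tile crops.

    Assumption: partial tiles at the right and bottom edges are dropped.
    """
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]


def split_train_val(samples, train_fraction=0.8, seed=0):
    """Shuffle deterministically, then split into training and validation lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    origins = tile_origins(1920, 1280)      # hypothetical orthoimage size
    train, val = split_train_val(origins)   # 8:2 split as in the text
    print(len(origins), len(train), len(val))  # prints: 6 4 2
```

Fixing the seed keeps the split reproducible across runs, which makes later ablation comparisons easier to interpret.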

3.2. Recurrent Residual Attention Module
Images collected by aerial photography have two main characteristics. The first is redundancy: most of the data in a single image are similar because of the natural environment of the forest. The second is noise, which is mainly caused by disturbances to the UAV during flight and by factors such as lighting that affect the camera mounted on the UAV. This noise or redundant information may have adverse effects on recognition tasks. Therefore, we combine the recurrent residual structure with the attention mechanism based on the characteristics of aerial photography data in forest areas. The entire RRAM is shown in Figure 3. We add RRAM to the backbone of YOLOv5, and the improved network is shown in Figure 4.
ResNet [10] introduces residual connections to alleviate network degradation and enable much deeper networks. The residual connections help the network learn and generalize features effectively. The architecture of ResNet-v2 further reduces the number of parameters while maintaining high accuracy by using bottleneck blocks and recurrent residual connections. These techniques have been validated and have achieved SOTA (state-of-the-art) performance on large-scale datasets such as ImageNet. The attention mechanism [11] refers to the ability to automatically focus on relevant information in a given input, thus enhancing model performance and interpretability. This technique has been widely applied in various neural networks, such as CNNs, RNNs (recurrent neural networks), and Transformers. We summarize the entire process of the module in the following formula:

$$y = \mathcal{R}(\mathcal{A}(x), \theta), \tag{1}$$

where $x$ and $y$ denote the input and output of the module, $\mathcal{R}$ represents the recurrent residual block, $\mathcal{A}$ represents the attention mechanism, and $\theta$ is a threshold learned by the network. The main function of $\theta$ is to reduce the interference of noise in the samples.
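To make the data flow of the module concrete, the following toy sketch runs on a 1-D feature vector in pure Python. It is not the authors' implementation. The text does not specify how the attention weights are computed, how the threshold is applied, or how many recurrent steps are used, so the sketch assumes sigmoid gating for attention, soft-thresholding for noise suppression, two recurrent steps, and attention applied before the recurrent residual block. All function names are hypothetical, and the threshold is passed in as a constant rather than learned.

```python
import math


def attention(x):
    """Assumed attention: gate each feature by a sigmoid of how far its
    magnitude lies above the mean magnitude (weak responses are damped)."""
    mean_mag = sum(abs(v) for v in x) / len(x)
    return [v / (1.0 + math.exp(-(abs(v) - mean_mag))) for v in x]


def soft_threshold(v, theta):
    """Shrink v toward zero by theta; any |v| <= theta maps to 0."""
    return math.copysign(max(abs(v) - theta, 0.0), v)


def recurrent_residual(x, theta, steps=2):
    """Apply one shared transform `steps` times, feeding back its own output,
    then add the input through a skip connection.

    In a real network the shared transform would be a learned convolution;
    here it is only the thresholding step, to keep the sketch small.
    """
    h = [0.0] * len(x)
    for _ in range(steps):
        h = [soft_threshold(xi + hi, theta) for xi, hi in zip(x, h)]
    return [xi + hi for xi, hi in zip(x, h)]


def rram(x, theta):
    """Toy composition: attention first, then the thresholded recurrent block."""
    return recurrent_residual(attention(x), theta)


if __name__ == "__main__":
    # Two small "noise-like" features and one strong feature (made-up values).
    print(rram([0.05, -0.05, 3.0], theta=0.5))
```

In this toy setting the small features stay small while the strong feature is reinforced by the recurrent path, which is the qualitative behavior the text attributes to the learned threshold.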

4.1. Implementation Details
We conducted experiments on an NVIDIA GeForce RTX 3060 GPU with an input resolution of 640 × 640 for 500 epochs. The experimental environment was PyTorch 1.8.0.

4.2. Evaluation Indicators
The evaluation metrics used in this study are precision ($P$), recall ($R$), and their harmonic mean ($F_1$). These three metrics objectively reflect, to a certain extent, the accuracy of the constructed deep learning model, and are defined as follows:

$$P = \frac{TP}{TP + FP}, \tag{2}$$

$$R = \frac{TP}{TP + FN}, \tag{3}$$

$$F_1 = \frac{2PR}{P + R}, \tag{4}$$

where $TP$ represents the number of trees correctly detected as pine wood nematode-infected, $FP$ represents the number of non-infected trees misclassified as pine wood nematode-infected, and $FN$ represents the number of infected trees that were not detected.
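As a worked illustration of the three metrics, the short function below computes precision, recall, and their harmonic mean from detection counts. The counts in the example are made up for illustration and are not results from this paper; returning 0.0 when a denominator is zero is an assumption of this sketch, and the function name `detection_metrics` is hypothetical.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from true-positive, false-positive,
    and false-negative counts.

    Assumption: a ratio with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 80 infected trees found, 20 false alarms, 10 missed.
p, r, f1 = detection_metrics(tp=80, fp=20, fn=10)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")  # prints: P=0.800 R=0.889 F1=0.842
```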

4.3. Results
We conducted a set of ablation experiments on RRAM, and the experimental results are shown in Table I. The YOLOv5s version was selected as the baseline because, in our tests, the s-version, with its smaller parameter size, meets the requirements of our onboard edge computing platform for the pine wood nematode-infected tree monitoring system.
In the experiments, all our improved versions achieved good performance (highlighted in bold). In the ablation experiment, we replaced the recurrent residual module in RRAM with an ordinary residual module. Its performance was slightly better than that of the original YOLOv5, but it still fell short of the recurrent residual module. We also show detection results in Figure 5. In Figure 5(a) and Figure 5(b), YOLOv5s (left) missed a pine wood nematode-infected tree next to the highway, which was successfully detected by our improved network. In Figure 5(c) and Figure 5(d), the detection results of YOLOv5s (left) were likewise not as good as those of the improved network.

5. CONCLUSION
In this paper, we presented a method for detecting pine wood nematode infestation based on an improved YOLOv5. First, we used unmanned aerial vehicles to take aerial photographs of forest areas in multiple regions of Zhoushan, Zhejiang Province at different times, and we cleaned, labeled, and augmented the resulting data to obtain a UAV aerial forest area dataset. Second, we improved YOLOv5 according to the characteristics of the dataset and proposed RRAM, which is based on the recurrent residual structure and the attention mechanism. The experimental results showed that our improvement enhanced network performance.

TABLE I. ABLATION EXPERIMENT