Efficient plastic categorization for recycling and real-time annotated data collection with TensorFlow object detection model

Plastic waste management is a major global issue, and recycling has become a necessary solution to mitigate the impact of plastic waste on the environment. Recycling plastic can significantly reduce pollution by diverting plastic waste from landfills, where it can take hundreds of years to decompose while releasing harmful chemicals and greenhouse gases. Although several systems have been developed for segregating municipal solid waste, only a few have focused on categorizing plastic waste. To address this gap, a plastic waste detection system using a TensorFlow pre-trained object detection model, MobileNet V2, is proposed. This work focuses on plastic waste types such as PET, HDPE, PVC, LDPE, PP, and PS. The proposed system can detect the plastic waste category in real time and store the detection information as annotation files in various formats, such as JSON, Pascal VOC, and TXT. The model saves the detection matrix only when the prediction confidence is greater than a threshold value. This data can be used for fine-tuning the model as well as for training new models. To validate the dataset generated by the object detection model, a sample of 54 images annotated by the model was used to train a new model and to verify that the model learns from the dataset. Furthermore, the proposed system promotes recycling, contributing to the reduction of environmental pollution.


Introduction
Plastic is derived from the Greek word 'plastikos', which means material that can be moulded into any shape. Plastics possess qualities such as affordability, low weight, and robustness, making them easy to shape into various items used in diverse fields. Plastic types are categorized based on their constituent materials: Polyethylene Terephthalate (PET), High-Density Polyethylene (HDPE), Polyvinyl Chloride (PVC), Low-Density Polyethylene (LDPE), Polypropylene (PP), Polystyrene (PS), and others. According to one report, approximately 8.3 billion tonnes of plastic were produced between 1950 and 2015, generating 5.8 billion tonnes of plastic waste. The study found that around 12% of this waste was burned, 9% was recycled, and approximately 60% was deposited in landfills [1]. Due to the considerable harm that plastic pollution has caused to the environment, recycling of plastic waste has grown at a rapid pace. However, plastic waste sorting is presently done mostly by hand, which causes inefficiency and errors, even though sorting precision is critical to the success of plastic recycling. The authors examined and contrasted various Life Cycle Assessment (LCA) study methodologies and results related to the management of plastic waste. Six factors were taken into account: the study's objectives and scope, functional units, impact assessment categories, system boundaries, regional context, and uncertainty analysis [2]. To accurately detect and separate plastic waste from other materials, object detection algorithms can recognize the presence and position of objects of interest in an image. Several studies have applied machine learning and deep learning to classify waste for sorting. One of the challenges in plastic waste detection and sorting is the large variation in plastic types and shapes, which makes it difficult to develop a general-purpose system that can detect and sort all types of plastic waste accurately [3]. Additionally, to sort waste using deep learning techniques, the model has to be trained with an appropriate dataset; in the case of object detection [4], an annotated dataset has to be used for training. Data collection and annotation is one of the most complex tasks in training detection models [5]. Section I introduces plastic waste management; section II outlines the research carried out in plastic waste management using artificial intelligence. Section III details the proposed system, and section IV explains data collection and annotation. Section V describes the object detection model, architecture, and training process; section VI contains the results and evaluation of the model. Section VII compares the proposed system with other existing systems, section VIII concludes the work, and section IX lists the references.

Problem background
Plastic pollution is a pervasive problem in both terrestrial and aquatic ecosystems, posing significant concerns for all forms of life. Global plastic production and accumulation in the environment have reached alarming levels, with minimal recycling and a large portion left to degrade in the environment or end up in landfills [6]. Object detection can be integrated to perform waste classification and connected to a robotic system to segregate waste, but the model has to be trained on a wide range of annotated data. Building large-scale datasets for object detection requires substantial resources, including time, manpower, and computational power. Annotating a vast number of images with multiple object classes can be time-consuming and expensive [5].

Related works
The presence of other items in the image, which might result in false positives or false negatives, presents another difficulty in the detection and sorting of plastic waste [7]. Complex features of plastic waste objects can be learned by CNNs (Convolutional Neural Networks), resulting in more precise and reliable detections for sorting [8]. Sundaralingam et al used the MobileNet V2 object detection model to segregate the waste generated in residential areas into six categories [9]. Various pre-trained models, such as VGG-16, MobileNet, AlexNet, ResNet, and DenseNet121, have demonstrated superior performance in waste object classification tasks. A smart waste classification system was proposed by Adedeji et al, in which distinct types of waste are categorized using the ResNet-50 CNN model and an SVM (Support Vector Machine). The system had an accuracy of 87% when tested on a dataset of trash images [10]. A model based on CNN and Graph-LSTM deep learning techniques to detect and classify waste materials on a conveyor belt in waste collection systems was developed by Li et al. The system is trained on six object classes and achieves 97.5% accuracy [11]. Zhou et al suggested a methodology that, with only 30 occurrences per category, achieves a mean average precision of 31.16% over 12 waste categories [12]. Zhang et al recommend the YOLO-Trash multi-label waste classification model based on transfer learning to effectively categorise various waste categories; a multi-label waste image dataset was produced to enhance the model's learning effectiveness [13]. The authors used the Stanford TrashNet dataset and internet-sourced images, a collection of 6640 images, to train ResNet101, EfficientNet-B0, and EfficientNet-B1 [14]. Frost et al introduce CompostNet, a deep learning image classification model that separates meal waste into compostable, recyclable, and landfill components [15]. Sousa et al propose a two-step deep learning approach for waste detection and classification in food trays; the method uses object detection techniques that allow higher-resolution bounding boxes to support the classification task [16]. RecycleNet is an optimised deep convolutional neural network architecture that uses fewer parameters than a 121-layer network to categorise a subset of recyclable object groups [17]. The models deployed in the tests include pre-trained VGG-16 and AlexNet, which obtained an accuracy of about 93% on a waste image database with about 400 images per class [18]. The system uses a pre-trained TensorFlow model for object detection and waste classification, and LoRa is used to send data about the location of the bin and its fill level in real time [19]. Krizhevsky et al trained a deep convolutional neural network on 1.2 million high-resolution images from the ImageNet LSVRC-2010 contest, yielding remarkable results. In spite of the complexity of training the model with high-resolution images, the authors achieved top-1 and top-5 error rates of 37.5% and 17.0%, surpassing previous benchmarks [20]. Zhou et al used hyperspectral data from space sensors to detect and identify various plastics using deep learning and machine learning models. The investigation involves aircraft data and data from the GF-5 and PRISMA satellites; the approach employs a combination of machine learning models for multi-label classification [21]. Neo et al conducted a comprehensive review of chemometric techniques in waste sorting with diverse spectroscopic and chemometric tools. The study highlights the need for expanded plastic waste categorization, suggests hybrid spectroscopic methods, emphasizes open-source databases for plastic spectra, and underscores the underutilization of innovative machine learning tools such as deep learning in plastic sorting [22]. Similarly, Yan et al introduce a novel method combining spectroscopy and machine learning models to discern inorganic components and characterize organic compounds in residual wastes for energy utilization [23].
While there have been developments in waste sorting and detection technologies, gaps remain to be addressed, both in plastic recycling itself and in collecting data for training deep learning models. This work proposes a plastic waste detection system that uses TensorFlow pre-trained object detection and MobileNet V2, which can detect plastic waste and store the detected object's information in various formats. Compared to previous work, this system is specifically designed for plastic waste detection, while prior work focused on waste classification and recycling in general. Additionally, the proposed system can store object detection information as annotation files in multiple formats, making it more versatile for different applications. Overall, this work addresses critical research gaps in plastic recycling and contributes to the reduction of environmental pollution.

Plastic waste detection system
The proposed plastic waste detection system was developed using TensorFlow pre-trained object detection with MobileNet V2 to detect plastic waste. The model was trained for 8000 steps on the Google Colab platform, and a frozen inference graph was exported for testing on a local machine. During real-time testing, the model evaluates the confidence score of each detected object and only outputs the object's bounding box coordinates and class label if the confidence score exceeds the threshold of 0.8. The detected object's information is then stored in an annotation file that can be used for future training or fine-tuning of the model. All real-time images are saved regardless of their classes, but annotation files are generated only for images whose detections exceed the threshold. The detection system stores the annotation file in various formats, such as JSON, Pascal VOC, and TXT, ensuring compatibility with various deep learning frameworks. Furthermore, this approach can aid in reducing the manual effort of data collection and annotation.
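The thresholding step described above can be sketched as follows. This is a hedged illustration, not the system's actual code: the function name, the detection layout, and the constant are all assumptions chosen to mirror the 0.8 threshold stated in the text.

```python
# Illustrative sketch: keep only detections whose confidence score meets
# the threshold (0.8 in the proposed system). Names are hypothetical.
CONFIDENCE_THRESHOLD = 0.8

def filter_detections(boxes, scores, classes, threshold=CONFIDENCE_THRESHOLD):
    """Return the detections whose confidence score meets the threshold."""
    kept = []
    for box, score, cls in zip(boxes, scores, classes):
        if score >= threshold:
            kept.append({"box": box, "score": score, "class": cls})
    return kept
```

Only the detections returned by such a filter would go on to be written out as annotation files; low-confidence detections are discarded.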

Data collection and labelling
The accuracy and efficiency of the model depend on the data used for training. To collect data, a webcam was used to capture images of plastic waste objects in various environments and conditions. The captured images were of dimension 640 × 480, which provided high-quality visual information for the model to learn from.
A data generator library was used to augment the images in order to improve the dataset's quality and quantity. This method entailed applying several transformations to the original images, such as rotation, scaling, flipping, and altering brightness, to create new images with multiple angles and viewpoints, as shown in figure 2. This process increases the diversity of the dataset and improves the model's ability to generalize to new and unseen images.
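The augmentation transforms above can be illustrated with a minimal NumPy sketch. The `augment` helper and the exact operations here are assumptions for illustration; the data generator library used in the work likely applies a richer set of transforms.

```python
import numpy as np

def augment(image, rng):
    """Produce simple augmented variants of an (H, W, C) uint8 image:
    a horizontal flip, a 90-degree rotation, and a brightness shift."""
    shift = int(rng.integers(-40, 41))  # random brightness offset in [-40, 40]
    brightened = np.clip(image.astype(np.int16) + shift, 0, 255).astype(np.uint8)
    return [np.fliplr(image), np.rot90(image), brightened]
```

Each captured 640 × 480 frame can thus yield several additional training images with different orientations and lighting, increasing dataset diversity at no extra collection cost.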
After collecting and augmenting the data, the images were annotated using a widely used tool called LabelImg. Annotation is the process of manually labelling the objects in the images with their corresponding class labels and bounding box coordinates. This is a critical step in object detection, as it provides the ground truth labels that the model learns from. In this case, the plastic waste was grouped into six categories based on the type of recyclable material used.
The annotation process involved identifying the object of interest in the image and drawing a bounding box around it. The class label was then assigned to the bounding box, and the coordinates of the box were stored in a file with an XML extension in Pascal VOC format. In total, 1537 images were collected for training and evaluating the model. The distribution of the dataset is shown in figure 3.
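A Pascal VOC-style XML file of the kind described above can be produced with Python's standard library. The helper name and the subset of VOC fields included here are illustrative; real VOC files carry additional fields such as `folder` and `segmented`.

```python
import xml.etree.ElementTree as ET

def voc_annotation(filename, width, height, objects):
    """Build a minimal Pascal VOC XML string for one image.
    `objects` is a list of (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for label, xmin, ymin, xmax, ymax in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        box = ET.SubElement(obj, "bndbox")
        for tag, val in zip(("xmin", "ymin", "xmax", "ymax"),
                            (xmin, ymin, xmax, ymax)):
            ET.SubElement(box, tag).text = str(val)
    return ET.tostring(root, encoding="unicode")
```

The same structure is what LabelImg writes for each manually drawn bounding box, and what the detection model later reproduces automatically for high-confidence detections.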

Object detection model
Object detection models are a type of computer vision algorithm that identifies the presence and location of objects within an image or video. Unlike classification models, which only classify an entire image as a certain category, object detection models can detect and locate multiple objects within a single image and label each object with its corresponding class. Object detection models typically use a combination of CNNs, anchor boxes, objectness features, and non-maximum suppression to identify and locate objects within an image. Object detection algorithms often achieve higher accuracy than classification algorithms because they take into account the spatial relationships between different parts of an object. This is particularly useful in complex scenes where objects may be partially occluded or overlapping. In real-life situations, various types of plastic waste may be mixed, making it difficult for classification models to distinguish between them accurately. However, this issue can be addressed by utilizing object detection techniques [4].

MobileNet V2
MobileNet is a widely used pre-trained model for object detection tasks due to its high performance and low computational cost. The rationale behind selecting the MobileNet object detection model is that its lightweight structure and optimized design make it particularly well suited for tasks requiring real-time processing and resource efficiency. The utilization of depthwise separable convolutions significantly reduces computational demands while maintaining an acceptable level of accuracy. This fits the goal of achieving precise object detection without compromising on speed, making MobileNet the most suitable option for the specific needs of this project. For object detection, MobileNet first extracts feature maps from the input image using a number of convolutional layers and then uses those feature maps to recognise items in the image.
Anchor boxes are a technique MobileNet uses to find objects in the image. These are predefined bounding boxes of various scales and aspect ratios placed at each spatial position in the feature maps. The model then predicts offset values for each anchor box, which define the position and size of the bounding box around the object of interest.
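The offset decoding described above can be sketched using the common SSD-style parameterization: center offsets scaled by the anchor's size and log-scaled width/height factors. Real detection heads typically also apply variance scaling factors, which are omitted from this simplified sketch.

```python
import math

def decode_anchor(anchor, offsets):
    """Decode predicted offsets (tx, ty, tw, th) against an anchor box
    given as (cx, cy, w, h); returns the (cx, cy, w, h) of the predicted
    box. Variance factors used by real SSD heads are omitted here."""
    cx, cy, w, h = anchor
    tx, ty, tw, th = offsets
    return (cx + tx * w,          # shift center x by a fraction of anchor width
            cy + ty * h,          # shift center y by a fraction of anchor height
            w * math.exp(tw),     # scale width (log-space prediction)
            h * math.exp(th))     # scale height (log-space prediction)
```

With all offsets zero, the decoded box is the anchor itself; the network only has to learn small corrections relative to the nearest anchor rather than absolute coordinates.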
The model also predicts an objectness score for each anchor box, which indicates the probability that the box contains an object of interest.This score is calculated using a logistic regression function based on the features inside the bounding box and a set of learned objectness features.
To remove overlapping detections and select the best candidate bounding boxes, MobileNet uses a technique called non-maximum suppression (NMS). NMS selects the highest-scoring bounding box for each detected object and removes any overlapping boxes with lower scores. MobileNet V2 also uses a technique called inverted residuals to improve accuracy. Inverted residuals use a residual connection to combine the output of the bottleneck with the original input tensor, but the input tensor is first passed through a 1×1 convolution to reduce its dimensionality before being combined with the bottleneck output. This reduces the number of computations required to compute the residual [24].
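Greedy NMS can be sketched in a few lines of plain Python. This is a generic didactic implementation, not the optimized TensorFlow op (`tf.image.non_max_suppression`) the model would actually use at inference time.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression: repeatedly keep the highest-scoring
    box and drop remaining boxes overlapping it above the IoU threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```

In the plastic-waste setting this prevents one bottle from being reported as several overlapping PET detections.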

Layers of MobileNet
Convolutional layer: This layer uses a set of filters, generally referred to as kernels, to execute convolution operations on the input image. Equation (1) describes this layer.
Pointwise convolution: To reduce the number of channels in the input feature map, this layer applies 1×1 convolution operations; the pointwise convolution layer is represented by equation (2).
Depthwise convolution: The input feature map is processed individually for each channel by this layer using depthwise convolution; this can be described using equation (3).
Batch normalization: To minimize internal covariate shift and improve the model's convergence, this layer normalizes the activations of the preceding layer, as in equation (4), where Mean[k] and Var[k] are the mean and variance of the activations for channel k, respectively, and epsilon is a small constant used for numerical stability.
ReLU activation: To provide nonlinearity, this layer applies the ReLU activation function

    Output(l, m, n) = max(0, Input(l, m, n))    (5)

where l, m, and n are the row, column, and channel indices of the output feature map, which for this elementwise operation coincide with the input indices u, v, and c.
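The depthwise-then-pointwise factorization that gives MobileNet its efficiency can be illustrated with a toy NumPy implementation (valid padding, stride 1). This is a didactic sketch, not the optimized TensorFlow op, and the names are illustrative.

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Toy depthwise separable convolution (valid padding, stride 1).
    x: (H, W, C) input; dw_kernels: (k, k, C), one spatial filter per channel;
    pw_weights: (C, C_out), the 1x1 pointwise channel-mixing weights."""
    H, W, C = x.shape
    k = dw_kernels.shape[0]
    out_h, out_w = H - k + 1, W - k + 1
    dw = np.zeros((out_h, out_w, C))
    for c in range(C):                      # depthwise: each channel filtered alone
        for i in range(out_h):
            for j in range(out_w):
                dw[i, j, c] = np.sum(x[i:i + k, j:j + k, c] * dw_kernels[:, :, c])
    return dw @ pw_weights                  # pointwise: 1x1 conv mixes channels
```

A standard convolution would need k·k·C·C_out multiplications per output position; the factorized form needs only k·k·C + C·C_out, which is the source of MobileNet's reduced computational cost.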

Training the model on plastic waste data
The model was trained in Google Colab for 8000 steps with a batch size of 4; the small batch size helps the model converge faster. Although small batches can increase wall-clock training time, training was carried out in a Google Colab GPU environment, so the computational time remained low. The losses during training were computed to assess the model's performance. The regularization loss, classification loss, and localization loss were 0.12, 0.09, and 0.03, respectively, giving an overall loss of approximately 0.24.

Evaluation results
The plastic waste detection model achieved an Average Precision (AP) of 0.827 for the IoU range of 0.50 to 0.95 and maximum detections (maxDets) of 100, indicating relatively good performance in detecting plastic waste, as shown in figure 5; maxDets refers to the maximum number of detections considered per image during evaluation. The model showed high precision, with an AP of 0.986 at both IoU = 0.50 and IoU = 0.75, suggesting that the model is highly accurate in identifying plastic waste. The Average Recall (AR) at IoU = 0.50:0.95 and maxDets = 1 was 0.759; figure 6 shows the average recall of the model. At IoU = 0.50:0.95 and maxDets = 10, the model showed an AR of 0.868, indicating that it detected about 87% of the plastic waste objects present in the images. The AR at IoU = 0.50:0.95 and maxDets = 100 was also 0.868, indicating consistent performance at higher detection limits. The evaluation of the model is shown in table 1. The precision and recall values for small and medium-sized objects were comparatively lower.

Annotated data collection
During real-time prediction, the model generates a detection matrix that captures the identified objects. These detection results are saved as annotation files with various extensions, such as Pascal VOC, TXT, and JSON. Specifically, when the confidence level of a detection is equal to or higher than 0.8, the system stores both the image and its corresponding annotation files, as illustrated in figure 8.
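Writing a JSON annotation only when the 0.8 confidence condition is met could look like the following sketch. The function name and the JSON layout are illustrative assumptions, not the system's actual file schema.

```python
import json

def save_annotation(path, image_file, detections, threshold=0.8):
    """Write a JSON annotation file for detections at or above the
    confidence threshold; return True only if a file was written.
    The JSON layout here is illustrative, not a fixed standard."""
    kept = [d for d in detections if d["score"] >= threshold]
    if not kept:
        return False                      # image saved elsewhere, no annotation
    with open(path, "w") as f:
        json.dump({"image": image_file, "objects": kept}, f, indent=2)
    return True
```

Because the image itself is always saved, frames whose detections fall below the threshold remain available for later manual annotation, as the conclusion of this work suggests.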

Training and evaluating the new model on annotated data generated
The dataset generated by the object detection model was used to train a new model, to assess dataset quality and to determine whether a model could learn from it. The training results showed a total loss of 0.25999224, with a classification loss of 0.07342744 and a localization loss of 0.040199466. The regularization loss was 0.14636534 and the learning rate was 0.0796716, as represented in figure 9. The model was trained for 3000 steps on only 54 images, with 9 images in each plastic waste category. This suggests that the dataset size may not be sufficient to produce accurate results, and a larger dataset may be required to improve the model's performance. Even with so few training images, the model performs well in classifying objects, though not in locating them, as shown in the side-by-side evaluation in figure 10. Overall, this indicates that the annotated dataset generated by the model can be used to train new models.

Comparison between the proposed model and existing models
As shown in table 2, existing waste classification systems segregate plastic into a single class; further classification by plastic category has not been done in most works. In [1] the authors considered only four categories, whereas the proposed work takes all six categories of plastics into account. Furthermore, the proposed work can also generate annotated data, which has not been offered in any of the previous works.

Conclusion
Plastic recycling is crucial for the environment, and improved sorting technologies and increased awareness can lead to significant benefits. The proposed plastic waste detection system using TensorFlow pre-trained object detection and MobileNet V2 shows potential for automating waste sorting processes, reducing labour costs, and increasing recycling efficiency. The model's performance evaluation demonstrates that it is highly accurate in identifying plastic waste but struggles with small and medium-sized objects due to dataset limitations. These limitations can be addressed by combining spectroscopic methods and chemometric tools with deep learning models [22, 23], and identifying microparticles through light scattering [32] can also help in detecting micro-objects. In instances where the system encounters diverse plastic items that cannot be promptly classified, images that the model struggles to predict accurately are stored without generating an annotation file; these images can then be manually annotated by experts. The newly annotated data can be used to fine-tune the model, incorporating information from these challenging scenarios. This iterative process of data collection, manual annotation, and model fine-tuning ensures that the system continuously improves its performance, even when encountering a wider range of plastic waste items, such as differently coloured plastics, plastic shopping bags, squeezed plastic bottles, and other variations.
Furthermore, saving the detection matrix information as an annotated file along with the image can be a valuable resource for future training. This approach provides additional information about each object's location, size, and shape, which can help improve the model's performance in detecting and classifying plastic waste. The annotated data can also be used to train other models or to create new datasets for future research. Moreover, the annotation file can be stored in formats such as Pascal VOC, TXT, and JSON, which can be used with all major deep learning frameworks. By widely adopting this process, a repository of annotated files can be created. Such a repository can serve as a valuable resource for solving various problems more efficiently: a collection of annotated files allows quicker access to labelled data, which is crucial in training and developing machine learning models, and provides ready-made datasets for researchers and practitioners across different domains. Overall, this research highlights the potential of deep learning models for improving plastic recycling and waste management. By promoting the recycling of plastic, the usage of virgin plastic can be reduced dramatically. In addition, different plastic materials are sometimes used for the same type of product; consider blisters, where both PET and PVC are used in manufacturing. The two materials are identical in appearance, which makes it hard to distinguish one from the other. To avoid such issues, authorities should establish ground rules requiring all manufacturers to use the same type of plastic for similar products.
With further research and development, such systems can reduce the negative impact of plastic waste on the environment and contribute to building a sustainable future.

Figure 1. Workflow of the proposed system.

Figure 3. Frequency of each plastic category in the dataset.

The regularization loss is the penalty applied to the model during training to prevent overfitting and to keep the model's complexity to a minimum. The classification loss is the difference between the prediction and the actual class of the object in the image. The localization loss is the difference between the ground truth bounding box and the predicted bounding box for each object in the image. The total loss is the sum of the regularization, classification, and localization losses and represents the overall performance of the model during training:

    Total Loss = Regularization Loss + Classification Loss + Localization Loss    (6)

The total loss of the model is approximately 0.24, which indicates good performance. After training, the frozen inference graph of the model was imported to a local machine for real-time testing. The loss graph is illustrated in figure 4; the graph is plotted using TensorBoard, a visualization tool used to track the performance of the model. The training log file of the model is provided as input to TensorBoard, and the graph is plotted from the log file. The results of this study demonstrate the potential of the MobileNet V2 model for detecting plastic waste in images. The losses calculated during training provide insights into the model's performance and can be used to optimize the model for future applications. The regularization loss is computed as in equation (7),

    Regularization Loss = λ Σᵢ wᵢ²    (7)

where λ is the regularization parameter and w represents the model's learnable parameters, such as the weights and biases of the convolutional layers. The classification loss in equation (8) is a cross-entropy loss,

    Classification Loss = −[y log ŷ + (1 − y) log(1 − ŷ)]    (8)

where y is the true class label of the object (0 or 1) and ŷ is the class probability predicted by the model. The localization loss in equation (9) uses the smooth L1 function,

    Localization Loss = smoothL1(ground truth − predicted)    (9)

where smoothL1 smooths the L1 loss and reduces its sensitivity to outliers, ground truth represents the true bounding box coordinates of the object, and predicted represents the bounding box coordinates predicted by the model. The losses are backpropagated through the model to adjust the learnable parameters and improve the model's accuracy.
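The smooth L1 behaviour referred to in equation (9) can be shown directly. The `smooth_l1` helper here is a generic implementation (equivalent to the Huber loss with delta = 1), not code from the training pipeline.

```python
def smooth_l1(diff):
    """Smooth L1: quadratic for |diff| < 1, linear beyond, so large
    box-coordinate errors (outliers) are penalized less steeply than L2."""
    d = abs(diff)
    return 0.5 * d * d if d < 1.0 else d - 0.5

def localization_loss(ground_truth, predicted):
    """Sum of smooth L1 over the four bounding-box coordinates."""
    return sum(smooth_l1(g - p) for g, p in zip(ground_truth, predicted))
```

Small coordinate errors contribute quadratically (e.g. an error of 0.5 costs 0.125), while an error of 2.0 costs only 1.5 rather than the 2.0 of L2, which keeps a few badly localized boxes from dominating the gradient.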

Figure 4. Loss of the model for 8000 steps.

Figure 5. Mean Average Precision (mAP) of the model across different IoU thresholds during evaluation.

Figure 6. Average Recall of the model across different IoU thresholds during evaluation.

Figure 7. (a) Side-by-side evaluation in TensorBoard. (b) Prediction results of the plastic waste detection system.

Figure 8. Annotated file generated from the model's detection.

Figure 9. Losses of the model trained using the annotated dataset.

Figure 10. Side-by-side evaluation of the model in TensorBoard for the generated dataset.

Table 1. Precision and Recall of the waste detection model for 8000 steps.

Table 2. Comparison between the proposed work and existing works.