Research on garbage sorting robotic arm based on image vision

This paper investigates the application of artificial intelligence to the automatic control of robotic arms, using deep learning algorithms to accurately identify different types of garbage so that the arm can classify and handle garbage autonomously. A suitable garbage classification dataset was selected and preprocessed. Several well-established convolutional neural network models, including VGG16, InceptionResNetV2, Xception, and InceptionResNetV3, were compared in terms of performance and suitability on the target dataset, and the Xception model, which exhibited the best performance metrics, was selected. The model was then optimized with a self-attention mechanism, a self-optimizing learning rate strategy, learning weight adjustments, and unfreezing of pre-trained layers, reaching a predictive accuracy of 96.9% and an AUC of 0.9989 on the test set. Finally, the robotic arm was modeled in a simulation environment, where it used the developed model to identify and sort garbage automatically.


Introduction
In recent years, robotic arms have been widely used as efficient automation control systems in fields such as industrial production, warehouse logistics, and healthcare. Extensive research has been conducted on optimizing their motion control. For instance, the LWPR real-time learning algorithm has been proposed to help robotic arms update and adapt to environmental information [1]. Distributed control strategies have been studied to enhance the performance and flexibility of modular robotic hands [2]. A mathematical framework based on symbolic computation has been introduced to improve the computational efficiency of inverse dynamics and inverse kinematics equations [3]. Furthermore, various motion planning strategies for robotic arms have been investigated [4], including linear programming, nonlinear programming, and genetic algorithms [5]. However, most robotic arm control and optimization still relies on predefined parameters and rules; intelligent control and autonomous decision-making remain under-researched, which limits the autonomy and flexibility of these systems.
By incorporating AI techniques, robotic arms can adapt to more complex and diverse environments and task requirements, which opens new possibilities for the automation control of traditional robotic arms. CNN research has produced mature models such as VGGNet [6], ResNet [7], and DeepFace [8], which have gradually found applications, for example in machine learning models for garbage image classification [9] and garbage recognition models for mobile terminals [10]. However, existing work is still limited by a small number of classification categories, insufficient accuracy, and constrained application scenarios.
This paper proposes a novel CNN-based model for garbage image classification and uses it to control a robotic arm for automated garbage sorting. Existing mature CNN models are first compared to select the one most suitable for garbage image classification. The selected model is then further optimized, in aspects such as the attention mechanism and learning rate, to achieve higher accuracy and to integrate with the robotic arm.

Dataset and Data Preprocessing
The dataset used in this study consists of 15,596 garbage images in 12 categories; see Table 1 for details. To improve the accuracy and reliability of the model, the dataset is preprocessed as follows:
(1) Undersampling: 607 images are randomly selected from each class of the original dataset.
(2) Resizing and normalization: the images are resized and cropped to a consistent size of 150x150 pixels while preserving the RGB color channels, and their pixel values are normalized.
(3) Splitting: the dataset is divided into the training set, validation set, and test set in the ratio 8.5:1:0.5. The resulting data distribution is given in Table 1.
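The undersampling and splitting steps can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the 8.5:1:0.5 ratio assigns the largest share to training, and that normalization means scaling 8-bit pixel values to [0, 1] (the paper does not state the method). The function names and the `paths_by_class` structure are hypothetical.

```python
import random

def undersample_and_split(paths_by_class, per_class=607,
                          ratios=(0.85, 0.10, 0.05), seed=0):
    """Pick `per_class` random items per class, then split train/val/test."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, paths in sorted(paths_by_class.items()):
        # Undersampling: draw without replacement from this class.
        chosen = rng.sample(paths, per_class)
        n_train = round(per_class * ratios[0])
        n_val = round(per_class * ratios[1])
        # Whatever remains after train and val goes to the test split.
        splits["train"] += [(p, label) for p in chosen[:n_train]]
        splits["val"] += [(p, label) for p in chosen[n_train:n_train + n_val]]
        splits["test"] += [(p, label) for p in chosen[n_train + n_val:]]
    return splits

def normalize_pixels(pixels):
    """Assumed normalization: scale 8-bit values to the range [0, 1]."""
    return [v / 255.0 for v in pixels]
```

Splitting within each class keeps the three subsets balanced across the 12 categories, which is the point of undersampling in the first place.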

Experimental Equipment and Parameter Settings
The equipment information used in this study is shown in Table 2.

Model Comparison and Selection
All models are trained on the same garbage classification dataset in a consistent environment, and their performance on garbage classification is compared. In addition, a simple convolutional neural network (CNN) is designed as a baseline. The baseline has a simple structure consisting of four pooling layers and two fully connected layers; its specific configuration is described in Table 3.
Training uses the server configuration given in Section 2.2. The optimizer is stochastic gradient descent (SGD) with a momentum parameter (β) of 0.9. The batch size is 64, the initial learning rate is 0.01, and each model is trained for 100 epochs. The pre-trained layers are frozen for all models. In testing, we compare the size, accuracy, average precision, recall, and AUC (area under the curve) of each model; the results are recorded in Table 4.
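For readers unfamiliar with the optimizer, the update rule behind SGD with momentum can be sketched in a few lines. The sketch uses the learning rate (0.01) and momentum (0.9) stated above; the velocity formulation shown is one common convention and is an assumption, since the paper does not say which variant its framework applies. The toy objective is purely illustrative.

```python
def sgd_momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    """One update: v <- beta * v - lr * grad, then w <- w + v."""
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

# Toy usage: minimize f(w) = w**2, whose gradient is 2 * w.
w, v = 5.0, 0.0
for _ in range(500):
    w, v = sgd_momentum_step(w, 2.0 * w, v)
```

The velocity term accumulates past gradients, so consistent gradient directions speed up progress while oscillating ones partly cancel.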
The test results show that the existing CNN models outperform the simple baseline model on garbage classification. Among the compared models, Xception achieves the highest accuracy, so it is the best suited to the garbage classification dataset used in this study.

Learning rate optimization strategy.
To improve the convergence speed, stability, and generalization performance of the model, this paper adopts the ReduceLROnPlateau learning rate self-optimization strategy. This strategy dynamically adjusts the learning rate based on the model's performance on the validation set: when that performance shows no significant improvement over several consecutive epochs, ReduceLROnPlateau multiplies the learning rate by a reduction factor, helping the model escape a local optimum and continue searching for better solutions.
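The plateau rule described above can be illustrated with a simplified re-implementation. This is a sketch of the idea only, not the library's code: the class name is hypothetical, and the `factor`, `patience`, `min_delta`, and `min_lr` values are illustrative assumptions, since the paper does not report the settings it used.

```python
class PlateauLRScheduler:
    """Simplified plateau rule: shrink the learning rate when the
    monitored validation loss stops improving."""

    def __init__(self, lr=0.01, factor=0.5, patience=3,
                 min_delta=1e-4, min_lr=1e-6):
        self.lr = lr
        self.factor = factor        # multiplier applied on a plateau (assumed)
        self.patience = patience    # epochs without improvement tolerated (assumed)
        self.min_delta = min_delta  # smallest change that counts as improvement
        self.min_lr = min_lr        # lower bound on the learning rate
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss):
        """Call once per epoch with the validation loss; returns the new lr."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr
```

With `patience=3`, three consecutive epochs without improvement trigger one reduction, after which the counter restarts.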

Learning weight adjustment.
By analyzing the prediction results of Xception, this study found that certain classes either had a disproportionate impact on the overall predictions or scored noticeably below the average. The learning weights of these classes were therefore adjusted to balance the training process. After reducing the weight of the "clothing" class and increasing the weights of classes such as "plastic" and "white_glass", the model's performance metrics showed some improvement.
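One common way to realize such per-class weights is a weighted cross-entropy loss, sketched below with NumPy. This is an assumption about the mechanism, since the paper does not specify how the weights enter training, and the weight values in the usage example are placeholders rather than the ones the authors chose.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Cross-entropy where each sample is scaled by the weight of its true class.

    probs:         (n_samples, n_classes) predicted probabilities
    labels:        (n_samples,) integer class indices
    class_weights: (n_classes,) weight per class
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    w = np.asarray(class_weights, dtype=float)[labels]
    # Negative log-likelihood of the true class for each sample.
    nll = -np.log(probs[np.arange(len(labels)), labels] + 1e-12)
    # Weighted average, so uniform weights reduce to the ordinary mean.
    return float(np.sum(w * nll) / np.sum(w))

# Placeholder example: class 1 is up-weighted, class 0 is down-weighted.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 1, 1]
uniform = weighted_cross_entropy(probs, labels, [1.0, 1.0])
reweighted = weighted_cross_entropy(probs, labels, [0.5, 2.0])
```

Raising a class's weight makes its misclassifications cost more, which pushes training toward the classes that were lagging.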

Unfreezing pretrained layers.
Pre-trained models trained on large-scale datasets can capture richer image feature information. This paper unfreezes the pre-trained layers layer by layer: starting from the top layer of the model, layers are unfrozen one at a time until all layers are unfrozen. The model's performance metrics on the test set are observed at each stage to find the optimal unfreezing configuration.
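The search over unfreezing depths can be sketched as below. The layer names and the `evaluate` callback are hypothetical stand-ins: in practice `evaluate` would fine-tune the network with the given trainable flags and return the observed metric, which is far more expensive than this toy version suggests.

```python
def unfreezing_candidates(layer_names):
    """Yield (k, flags): the top k layers trainable, all others frozen.

    `layer_names` is ordered from the input side (bottom) to the output (top).
    """
    n = len(layer_names)
    for k in range(n + 1):
        yield k, [i >= n - k for i in range(n)]

def best_unfreezing(layer_names, evaluate):
    """Return (k, score) for the candidate with the highest evaluate(flags)."""
    scored = ((k, evaluate(flags))
              for k, flags in unfreezing_candidates(layer_names))
    return max(scored, key=lambda item: item[1])

# Toy usage with a stand-in metric that peaks when three layers are trainable.
layers = ["block1", "block2", "block3", "head"]
best_k, best_score = best_unfreezing(layers, lambda flags: -abs(sum(flags) - 3))
```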
2.4.5. Results.
The optimizations in this paper are cumulative: each one is introduced on top of the previous ones. As Table 5 shows, the model's performance metrics improve as the optimizations accumulate.

Experiment results and analysis
In this study, the Xception model was chosen and optimized on a garbage classification dataset. The final model achieved an accuracy of 96.9% on the test set, with average precision and recall of 96.9% and 96.8%, respectively, and an average area under the ROC curve (AUC) of 0.9989. The computation time of the model was approximately 0.06-0.08 seconds. Detailed data can be found in Table 6.
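The accuracy, average precision, and average recall reported above can be derived from a confusion matrix, as in this NumPy sketch. It assumes that "average" means an unweighted (macro) mean over classes, which the paper does not state explicitly; the function name and the toy labels are illustrative only.

```python
import numpy as np

def summarize(y_true, y_pred, num_classes):
    """Accuracy plus macro-averaged precision and recall."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                  # rows: true class, columns: predicted
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)         # samples predicted as each class
    actual = cm.sum(axis=1)            # samples truly in each class
    # Classes with no predictions (or no samples) contribute 0 instead of NaN.
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return {
        "accuracy": float(tp.sum() / cm.sum()),
        "precision": float(precision.mean()),
        "recall": float(recall.mean()),
    }

# Toy labels for three classes, not data from the paper.
report = summarize([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
```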

Simulation Environment and Robotic Arm Configuration
For the robotic arm control part, this paper uses CoppeliaSim to build the simulation environment, with the LBR_iiwa_14_r820 simulation model as the robotic arm. The LBR iiwa 14 R820 is a robotic arm produced by KUKA with 7 independently movable joints, which allow it to perform complex actions and tasks with multiple degrees of freedom. Small blocks are used to simulate garbage, and a conveyor belt transports them.

Garbage Classification Process
When the object is transported to the front of the robotic arm via the conveyor belt, a signal is sent to control the robotic arm's operation. Subsequently, the robotic arm classifies the garbage based on images captured by the camera and outputs specific joint rotation angles based on the classification results to transport the object to the designated classification bin. Once completed, the robotic arm returns to the initial position, waiting for the conveyor belt to transport the next piece of garbage to be classified. Figure 1 illustrates the entire process of recognition and classification using a battery as an example.
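The cycle described above can be outlined as a small control routine. Everything named here is hypothetical: `classify` stands in for the trained model, `move_to` for the simulator's joint command, and the joint-angle tuples are placeholders that reflect neither the arm's real joint count nor the poses used in the paper.

```python
# Placeholder joint targets in degrees; real poses depend on the scene layout.
HOME = (0.0, 0.0, 0.0)
BIN_POSES = {
    "battery": (90.0, 45.0, -30.0),
    "plastic": (-90.0, 45.0, -30.0),
}

def sorting_cycle(image, classify, move_to):
    """Handle one item: classify it, carry it to its bin, return home."""
    label = classify(image)   # predicted garbage category
    move_to(BIN_POSES[label]) # joint angles chosen from the class
    move_to(HOME)             # wait for the next item on the conveyor
    return label

# Toy usage with stand-ins that record the commanded poses.
moves = []
result = sorting_cycle("frame_0", lambda img: "battery", moves.append)
```

Keeping the class-to-pose mapping in a table separates the recognition model from the motion logic, so either can be changed without touching the other.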

Conclusion
This paper presents an automatic control method for a robotic arm that uses convolutional neural networks to build an image recognition model for automated garbage classification. The method selects the Xception model, which is more suitable for garbage classification datasets, and adjusts and optimizes it to achieve an accuracy of over 96%. Compared to other models, the accuracy has improved by an average of approximately 8%. In specific cases, it can serve as an effective method for garbage categorization. Additionally, the paper uses the predictions of the model to control the robotic arm automatically. The robotic arm can identify the category of garbage on the conveyor belt and transport the garbage to the corresponding location, accomplishing automated garbage sorting. The proposed method can effectively recognize different categories of garbage and use the robotic arm to sort them, and it has broad application prospects.
However, this method faces some limitations in more complex task environments. The model can only recognize 12 categories of garbage, which poses certain restrictions. Moreover, the robotic arm encounters difficulties in grasping certain challenging garbage items. Future research can be conducted in the following areas:
(1) Feeding more diverse datasets of garbage images to enhance the model's ability to recognize multiple categories.
(2) Optimizing the end effector of the robotic arm and the grasping mechanism to improve accuracy in gripping garbage with different shapes.

Figure 1. Automated control process of the robotic arm.

Table 1. Dataset specifics.

Table 2. Experimental Equipment and Parameter Settings.

Table 3. Simple CNN Model for Comparisons.

Table 4. Model Performance Comparison.

Based on the original Xception model, this paper attempts to optimize the efficiency and performance of the model during training, guided by the feedback of performance metrics on the test set. The following optimizations were tried.
2.4.1. Attention mechanism.
By introducing a self-attention mechanism, this paper selectively weights the features according to the correlation between different positions in the input feature map, making the model more focused and sensitive when processing input data. The attention weights are computed with the Scaled Dot-Product Attention formula: the similarity between queries and keys is calculated, and the resulting weights are multiplied with the corresponding values.
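Scaled dot-product attention is usually written as softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch of self-attention over the positions of a flattened feature map is given below. The projection matrices, the 5x5 map size, and the channel counts are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(features, w_q, w_k, w_v):
    """Scaled dot-product self-attention.

    features: (positions, channels), a feature map flattened over space
    w_q, w_k, w_v: (channels, d) projections to queries, keys, and values
    """
    q, k, v = features @ w_q, features @ w_k, features @ w_v
    # Similarity between every pair of positions, scaled by sqrt(d_k).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights

# Toy usage: a 5x5 map with 8 channels, projected to 4 dimensions.
rng = np.random.default_rng(0)
x = rng.normal(size=(25, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
out, attn = self_attention(x, w_q, w_k, w_v)
```

Each output position is a weighted mix of the values at all positions, so the model can emphasize the regions most relevant to the object being classified.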

Table 5. Model optimization results.