Gesture Recognition of sEMG Signals Based on CNN-GRU Network

To improve the accuracy of surface electromyogram (sEMG) gesture recognition and avoid the manual extraction of large numbers of features, this paper proposes a gesture recognition method based on a deep neural network that integrates a CNN and a GRU. The 8-channel sEMG data collected by the MYO armband are input to the CNN for feature extraction; the resulting feature sequence is then fed to the GRU network for gesture classification, and the recognized gesture category is output. Experimental results show that the proposed method reaches 76.41% recognition accuracy on the MyoUP dataset, demonstrating its practicality.


Introduction
With the rapid growth of artificial intelligence and ongoing advances in science and technology, EMG signal gesture recognition has become an increasingly important means of achieving human-computer interaction. EMG signals are electrical signals generated by muscle discharge during movement. By analyzing and processing these signals, rich information such as muscle activity status, strength, and hand posture can be obtained. EMG signal gesture recognition technology therefore has broad application prospects in fields such as medical rehabilitation, smart prosthetics [1], industrial robotics [2], natural human-computer interaction [3], virtual reality, and gaming.
EMG signal gesture recognition is one of the current hot spots in deep learning research. With the help of deep learning models, accurate and detailed features can be extracted from EMG signals for efficient gesture classification. However, current studies show that the limited diversity of certain gestures across sEMG recordings hampers model performance, so optimization methods for EMG gesture recognition still require further exploration. In many existing approaches, hand-crafted representations of the myoelectric signal serve as input to a deep neural network model. For example, ZHAI et al. [4] used the fast Fourier transform to extract the signal spectrum from surface EMG signals and fed it into a CNN model for hand identification. SHEN et al. [5] extracted many domain features and converted them into images, which were used as input to stacked CNN models for EMG gesture recognition. Although these methods achieve high gesture recognition accuracy, they partly ignore the end-to-end feature learning capacity of the CNN model. Furthermore, RNNs have advantages in processing time-series data, so researchers began to combine CNNs and RNNs for myoelectric gesture recognition. For instance, Yu Hu et al. [6] put forward a hybrid CNN-RNN attention-based model, which achieved a certain recognition accuracy, but its computation time is long and its hardware requirements are high, making it unsuitable for the actual control process. This paper therefore proposes a gesture recognition technique for myoelectric signals based on a CNN-GRU combination. First, convolutional layers extract features from the raw signal; the resulting feature sequence is then input to a GRU network for gesture recognition. Finally, the superiority of the algorithm is demonstrated by comparison with other classification models.

MyoUP Database
The MyoUP dataset is a public dataset for sEMG signal classification, designed to evaluate the performance of EMG signal recognition algorithms. It was created by a research team at the University of Patras.
This dataset was acquired with the Myo armband, which provides 8 sEMG channels at a sampling frequency of 200 Hz. The Myo armband has been widely used in research [7][8][9]. The MyoUP dataset includes data from 8 participants (3 women, 5 men; 1 left-handed, 7 right-handed; ages 21-23). As shown in Figure 1, the dataset records five fundamental finger motions, twelve isotonic and isometric hand configurations, and five gripping gestures; each gesture is repeated five times, with a one-second break between repetitions to avoid muscle fatigue.
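As a concrete illustration, continuous multichannel recordings like these are typically cut into fixed-size windows before being fed to a network. The sketch below assumes a 200-sample (one-second) window with 50% overlap; these values are illustrative, not specified by the dataset.

```python
import numpy as np

def segment_semg(signal, window_len=200, step=100):
    """Slice a multichannel sEMG recording into overlapping windows.

    signal: array of shape (n_samples, n_channels); at 200 Hz a
    200-sample window covers one second of muscle activity.
    """
    windows = []
    for start in range(0, signal.shape[0] - window_len + 1, step):
        windows.append(signal[start:start + window_len])
    return np.stack(windows)  # (n_windows, window_len, n_channels)

# Example: 5 seconds of simulated 8-channel Myo data at 200 Hz
recording = np.random.randn(1000, 8)
windows = segment_semg(recording)
print(windows.shape)  # (9, 200, 8) with 50% overlap
```

Each window then becomes one training sample labeled with the gesture performed during that second.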
This dataset has been extensively used in the field of myoelectric signal classification, including gesture recognition, finger activity recognition, and human action recognition. Researchers can use it to develop and test the effectiveness and accuracy of EMG signal classification algorithms, providing important support for research in this field.

Convolutional Neural Networks
CNNs were proposed by LeCun et al. [10] in the 1980s and are mainly used for the classification and recognition of high-dimensional data. Compared with traditional fully connected neural networks, CNNs use convolutional structures to extract spatial relationships in data, which greatly reduces the number of parameters the network needs to learn. Additionally, CNNs employ pooling operations to further compress feature information and limit the possibility of overfitting. A classical CNN consists of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer. The convolutional and pooling layers are the core components, responsible for extracting spatial features and reducing the spatial size of the feature maps.
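A quick parameter count illustrates the savings from weight sharing. The layer sizes below are hypothetical, chosen to match an 8-channel, one-second sEMG window:

```python
# Hypothetical sizes for illustration: a 200-sample, 8-channel sEMG window.
window_len, channels = 200, 8
hidden_units = 64            # width of a fully connected layer
kernel_len, filters = 5, 64  # 1-D convolution settings

# Fully connected: every input value connects to every hidden unit.
dense_params = (window_len * channels) * hidden_units + hidden_units

# 1-D convolution: one small kernel is shared across all time steps.
conv_params = (kernel_len * channels) * filters + filters

print(dense_params)  # 102464
print(conv_params)   # 2624
```

The shared kernel needs roughly 40 times fewer parameters here while still covering the whole window.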
CNNs also have unique characteristics, such as local receptive fields and weight sharing [11], enabling excellent performance in fields such as image classification [12], object detection [13], and face recognition [14], and making them an important part of deep learning.

Gated Recurrent Unit
As a variant of the recurrent neural network, the gated recurrent unit (GRU) adds gating mechanisms to the RNN to solve the gradient vanishing and gradient explosion problems faced by traditional RNNs. Similar to the LSTM, the GRU can process sequence data efficiently and has strong memory capability.
The GRU is similar to the LSTM [15], whose cell contains input, forget, and output gates. Unlike the LSTM, the GRU has only two gates, an update gate and a reset gate, which control how the hidden state is updated. Compared with the LSTM, the GRU has fewer parameters, computes faster, and is easier to train.
The GRU is computed as

z_t = σ(W_z x_t + U_z h_{t-1} + b_z)
r_t = σ(W_r x_t + U_r h_{t-1} + b_r)
h̃_t = tanh(W_h x_t + U_h (r_t ⊙ h_{t-1}) + b_h)
h_t = z_t ⊙ h_{t-1} + (1 − z_t) ⊙ h̃_t

where x_t is the input vector of the current time step, h_t is the hidden state vector, σ(•) is the sigmoid function, ⊙ represents element-wise multiplication, and W, U, and b are weight matrices and bias vectors.
The update gate z_t controls how much of the hidden state h_{t-1} from the preceding time step is carried into the hidden state h_t of the current step. When z_t is close to 1, most of the previous state information is retained; when z_t is close to 0, more weight is given to the input information of the current time step. The reset gate r_t controls whether the previous state information should be forgotten at the current time step, so that new features can be better extracted from the current input; this also helps alleviate the gradient vanishing problem.
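A small NumPy sketch of a single GRU step makes the gating mechanism concrete. The dimensions and parameter layout below are illustrative only; the sign convention follows the description above (z close to 1 retains the previous state).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU time step; z retains the old state, r gates the new one."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(x_t @ Wz + h_prev @ Uz + bz)             # update gate
    r = sigmoid(x_t @ Wr + h_prev @ Ur + br)             # reset gate
    h_cand = np.tanh(x_t @ Wh + (r * h_prev) @ Uh + bh)  # candidate state
    return z * h_prev + (1.0 - z) * h_cand               # new hidden state

# Toy dimensions: 8 input channels, 16 hidden units (illustrative only).
rng = np.random.default_rng(0)
d_in, d_h = 8, 16
shapes = [(d_in, d_h), (d_h, d_h), (d_h,)] * 3  # z, r, candidate parameters
params = [0.1 * rng.standard_normal(s) for s in shapes]
h = gru_step(rng.standard_normal(d_in), np.zeros(d_h), params)
print(h.shape)  # (16,)
```

Because the candidate state passes through tanh and is blended with the bounded previous state, the hidden state stays well-scaled across time steps.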

CNN-GRU Architecture
This paper designs a new CNN-GRU architecture, as shown in Figure 2. It consists of five convolution blocks with 32, 32, 32, 64, and 64 filters, respectively. Each convolution is followed by a nonlinear activation function, and a max pooling layer is added after the second and fourth convolutions. Furthermore, dropout layers with a probability of 0.15 follow the second and fourth convolutional blocks. GRU units are added after the fifth block for sequential modeling, and three dense layers are appended to the network, each with a ReLU nonlinearity and a dropout layer linked after it.
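A minimal Keras sketch of the described architecture follows. The window length, kernel size, GRU width, dense-layer sizes, dropout rate after the dense layers, and the 22-class output (5 finger motions + 12 hand configurations + 5 grips) are assumptions not fully specified in the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_gru(window_len=200, n_channels=8, n_classes=22):
    """Sketch of the CNN-GRU: 5 conv blocks, pooling/dropout after
    blocks 2 and 4, a GRU for sequence modeling, then 3 dense layers."""
    m = models.Sequential()
    m.add(layers.Input(shape=(window_len, n_channels)))
    for i, f in enumerate([32, 32, 32, 64, 64]):  # five convolution blocks
        m.add(layers.Conv1D(f, kernel_size=3, padding="same",
                            activation="relu"))
        if i in (1, 3):                           # after blocks 2 and 4
            m.add(layers.MaxPooling1D(pool_size=2))
            m.add(layers.Dropout(0.15))
    m.add(layers.GRU(64))                         # sequential modeling
    for units in (128, 64, 32):                   # three dense layers
        m.add(layers.Dense(units, activation="relu"))
        m.add(layers.Dropout(0.15))
    m.add(layers.Dense(n_classes, activation="softmax"))
    return m

model = build_cnn_gru()
model.summary()
```

The two pooling layers shrink the 200-step sequence to 50 steps before the GRU, keeping the recurrent computation cheap.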

Experiment Environment and Parameter Settings
The experiment uses an Intel i7-11800H processor, an NVIDIA GeForce RTX 3060 graphics card, and 16 GB of memory; the training and testing of the neural networks are mainly performed on the GPU. TensorFlow is used as the deep learning framework, with PyCharm as the integrated development environment. 80% of the samples form the training set and 20% form the test set.

Experimental Results
In this experiment, the MyoUP dataset was classified using deep learning methods. Figure 3 shows the accuracy curve of gesture recognition for the CNN-GRU network. The accuracy gradually improves as the number of epochs increases, indicating that the fitting ability of the model is continuously enhanced; gesture recognition accuracy reached a maximum of 76.41% at epoch 50.
The partial recognition results of the final model are also shown as a confusion matrix in Figure 4, where each column shows the predicted class, each row shows the true class, and the numbers on the diagonal represent the correctly recognized samples of each gesture type. The results show that the final recognition rate meets expectations and the recognition effect is relatively good. As Table 1 shows, the classification accuracies of the CNN, CNN-LSTM, and CNN-GRU models are 72.06%, 73.12%, and 76.41%, respectively; the CNN and CNN-LSTM models are slightly inferior to the CNN-GRU model.
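The layout of such a confusion matrix (rows for true classes, columns for predicted classes) and the accuracy on its diagonal can be computed as follows; the labels are toy values for illustration.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy labels for three gesture classes (illustrative only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
print(cm)

# Overall accuracy is the diagonal (correct predictions) over the total.
accuracy = np.trace(cm) / cm.sum()
print(accuracy)  # 4 of 6 correct
```

Off-diagonal entries reveal which gestures are confused with each other, which per-class accuracy alone does not show.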

Conclusions
Through this experiment, we explored a gesture recognition method for EMG signals based on the CNN-GRU combination and analyzed its advantages and disadvantages. First, by combining CNN and GRU networks, the proposed method can extract features and perform classification more accurately, achieving high recognition accuracy. Second, the method has advantages in computation time and hardware requirements and can be applied to the actual control of bionic hands. However, it still has some shortcomings, such as the difficulty of tuning model parameters and the high cost of collecting and labeling training data, which need to be further optimized and solved. Future work may therefore consider increasing the sample size, optimizing the network structure, and introducing incremental learning and other techniques to further improve the performance and practicability of the method.

Figure 1 .
Figure 1.All gestures of the MyoUP database.


Table 1: Comparison of gesture recognition accuracy of different models

Model    | Accuracy
CNN      | 72.06%
CNN-LSTM | 73.12%
CNN-GRU  | 76.41%