Decoding of unimanual and bimanual reach-and-grasp actions from EMG and IMU signals in persons with cervical spinal cord injury

Objective. Chronic motor impairments of arms and hands as the consequence of a cervical spinal cord injury (SCI) have a tremendous impact on activities of daily life. A considerable number of people, however, retain minimal voluntary motor control in the paralyzed parts of the upper limbs that is measurable by electromyography (EMG) and inertial measurement units (IMUs). Integrating these signals into human-machine interfaces (HMIs) holds promise for reliable grasp intent detection and intuitive assistive device control. Approach. We used a multimodal HMI incorporating EMG and IMU data to decode reach-and-grasp movements of persons with cervical SCI (n = 4) and without (control, n = 13). A post-hoc evaluation of control group data aimed to identify optimal parameters for online, co-adaptive closed-loop HMI sessions with persons with cervical SCI. We compared the performance of real-time, Random Forest-based movement versus rest (2 classes) and grasp type predictors (3 classes) with respect to their co-adaptation and evaluated the underlying feature importance maps. Main results. Our multimodal approach enabled grasp decoding significantly better than EMG or IMU data alone (p < 0.05). We found the 0.25 s directly prior to the first touch of an object to hold the most discriminative information. Our HMIs correctly predicted 79.3 ± STD 7.4 (102.7 ± STD 2.3 control group) out of 105 trials, with grand average movement vs. rest prediction accuracies above 99.64% (100% sensitivity) and grasp prediction accuracies of 75.39 ± STD 13.77% (97.66 ± STD 5.48% control group). Co-adaptation led to higher prediction accuracies over time, and we identified adaptations in feature importances unique to each participant with cervical SCI. Significance. Our findings foster the development of multimodal and adaptive HMIs that allow persons with cervical SCI to intuitively control assistive devices and improve their personal independence.


Introduction
Chronic motor impairment can severely impact a person's daily life routine. Depending on the severity of the impairment, activities of daily life (ADL) such as personal hygiene or food intake cannot be performed without help. Reasons for a loss of motor function range from neuropathological conditions such as stroke to traumatic brain or spinal cord injury (SCI) [1,2]. Especially for persons with a traumatic SCI at the cervical level, circumstances turn out to be life-changing, since not only lower extremities and autonomic body functions are affected, but also upper extremity functions, namely reaching and grasping. Despite significant advances in trauma medicine and rehabilitation, no causal therapies for SCI are currently available, and a cure is not to be expected in the foreseeable future. Nevertheless, affected persons seek interventions to overcome their impairment or at least mitigate its effects to gain more independence and a self-dependent life. In studies with people with tetraplegia, more than three-quarters rate regaining upper extremity functions as their highest priority [3,4].
Since physiotherapeutic and occupational interventions based on task-specific therapies have limited potential to achieve functional recovery in people with severe paralysis, technical interventions incorporating assistive devices may foster self-dependence. Decisive attempts have been made using non-invasive electroencephalographic (EEG) based approaches to detect end users' movement intentions [5][6][7][8][9][10][11]. In this case, neurophysiological signals are captured directly on the end user's scalp, bypassing the lesion site in the spinal cord. These so-called brain-computer interfaces (BCIs) attempt to decode movement-related brain signals in near real-time and to generate control signals for assistive devices, such as robotic arms [12] or grasp neuroprostheses [7,[13][14][15]. Unfortunately, EEG-based BCI approaches for movement intention detection and control are still limited in performance, inter alia by unfavorable signal-to-noise ratios, lacking signal robustness, and significant inter-session variability. Even minimal changes in position, blinking, or physiological attributes create substantial noise and mask signal information. This inevitably leads to unsatisfying detection accuracies and high false positive command rates. In addition, user compliance is notoriously affected by lengthy retraining sessions, EEG-cap comfort, and aesthetics [16]. Recent studies have shown that a considerable number of persons with SCI retain voluntary functions below the level of the lesion. Even in clinically complete lesions, a few fibers of long fiber tracts are in most cases preserved, so that some motor commands and sensory feedback can pass the lesion site [17]. These very low muscular activities do not generate any meaningful movement but can be detected using EMG [18,19]. Modern machine learning methods allow the use of these signals as a reliable and intuitive source for assistive device control, with accuracies not only surpassing those of BCIs [19,20] in general, but also promising more robust signal behavior and substantially less training time. Additionally, many persons with cervical SCI show some preserved shoulder and elbow function that can be captured via inertial measurement units (IMUs). Studies with non-disabled persons have already used similar kinematic information for hand activity [21,22] or gesture [23] recognition.
Multimodal EMG-IMU human-machine interfaces (HMIs), in which both sensor modalities complement each other, have been shown to achieve better movement classification performance than single-modality HMIs in people without disabilities [24]. They have also been applied to various challenges such as hand gesture classification [25], sign language interpretation [26], as well as hand and back injury assessments [27,28], but have yet to be translated to persons with SCI.
Generally, the performance of an HMI does not solely rely on appropriate signal capture and feature engineering but also on the selection of suitable processing methods such as machine learning [25,26,29]. Previous experiments have shown that tree-based models such as Random Forests perform well in supervised multimodal HMIs for activity recognition [24]. Additional factors such as a co-adaptive environment can further enhance HMI performance [30,31]. Here, human and machine interact through closed feedback loops in a mutual learning environment: the user receives constant feedback based on their actions, while the machine receives new training samples over time and adapts to the human. This can stabilize HMI performance by mitigating non-stationary properties and helps compensate for effects over time, such as fatigue and sensor signal degradation [31][32][33][34].
In our current work, we present a co-adaptive HMI that uses both EMG and kinematic IMU data of residual upper limb muscle activities and movements of persons with cervical SCI to decode unimanual and bimanual reach and attempted grasp actions in real-time. We initially collected EMG and kinematic data of non-disabled persons (control group) during self-initiated reach and grasp tasks and used their data to optimize feature engineering and hyperparameter tuning. In the subsequent series involving persons with subacute cervical SCI (n = 4), we assessed the co-adaptive decoding performance of the HMI for three different reach and attempted grasp actions: a unimanual lateral, a unimanual palmar, and a bimanual palmar grasp, which are sufficient for routine ADL such as eating or drinking [35]. We believe that our findings will support the development of multimodal and co-adaptive HMIs that allow persons with SCI to control assistive devices intuitively and thereby regain personal independence.

Participants
We assigned all participants to one of two groups: a control group of non-disabled participants (n = 13) and a group of participants with cervical SCI (n = 4). Control group participants (six female, seven male) were aged between 22 and 42 years (29.4 ± STD 6.1 years) and were right-handed except for one participant. The hand-dominance test of Steingruber confirmed the handedness [36]. Participants with SCI (all male) were aged between 32 and 66 years (52.7 ± STD 15 years) and self-reported to be (originally) right-handed. Their neurological characteristics can be found in table 1. Primary inclusion criteria for this group were (i) a time since injury above four weeks, (ii) sufficient remaining voluntary control of shoulder and elbow for the experiment, and (iii) severely impaired to no hand function.

Experimental design
All experiments were approved by the ethical committee of Heidelberg University, Heidelberg, Germany (S-078/2022) and carried out at the SCI Center of Heidelberg University Hospital, Heidelberg, Germany. Participants gave their written informed consent. Control group experiments were carried out as offline sessions for intermediate, post-hoc evaluation. After analysis of all control group data, participants with SCI partook in a modified version of the experiment with an online co-adaptive closed-loop HMI, which already incorporated the intermediate control group findings. At the start of the experiment, all participants were seated in front of a desk. We positioned a paperweight and a jar at a comfortable reaching distance. Participants reached for them and either performed a unimanual lateral grasp (paperweight) or a uni- or bimanual palmar grasp (jar). Both unimanual grasps had to be performed with the right hand. Participants with cervical SCI were instructed to attempt to grasp the objects to the best of their abilities. In order to detect movement and grasp onsets, we sensorized regions on the table and objects using foil pressure sensors.
While the control group experiments were conducted in an open-loop manner, participants with SCI received audio feedback after each trial: one beep for successful prediction of the reaching movement and two beeps for the correct prediction of the grasp type by the system. Each experiment was divided into eight blocks, in which participants had to perform five self-initiated reach-and-grasp movements towards one of the objects in a random order of their choosing (see figure 1). Between blocks, object positions were swapped to mitigate any directional confounders, and participants remained in a resting state for one minute. We performed manual trial rejection during the experiments to record a total of 40 trials per condition (TPC) and at least eight minutes of rest per participant.

Data recording and preprocessing
We acquired EMG data (fs = 1200 Hz) from eight upper limb muscles (see figure 2(a)): the deltoid muscle (i) pars acromialis and (ii) pars clavicularis, (iii) biceps and (iv) triceps, (v) brachioradialis, and (vi) extensor carpi radialis longus, as well as the (vii) flexor digitorum superficialis and (viii) opponens pollicis muscles. Electrodes were positioned according to the guidelines of the European SENIAM project [38]. We used Ambu Neuroline 720 electrodes (Ambu A/S, Ballerup, Denmark) in a bipolar electrode setup and a g.USBamp bio amplifier for recording (g.tec medical engineering GmbH, Schiedlberg, Austria). We further acquired accelerometer and gyroscope data as well as the orientation of five ICM-20948 IMUs (InvenSense, San Jose, CA, USA) with fs = 12 Hz. They were fixed at (i) the central upper back, (ii) the shoulder, (iii) the upper and (iv) the lower arm, as well as on the (v) back of the hand (figure 2(b)). Our data acquisition framework was built upon the lab streaming layer (LSL) [39], which allowed us to perform software synchronization of all data in real-time. We only sensorized the guide arm rather than both limbs since we wanted to keep the experiment's complexity and the preparation time for the participants to an absolute minimum. All analysis was done using MATLAB 2022b (MathWorks, Massachusetts, USA). We prefiltered all EMG data using a sixth-order 50 Hz notch filter to mitigate power line noise, applied a causal fourth-order 20-450 Hz Butterworth bandpass filter for movement and high-frequency artifact reduction, and calculated the bipolar derivative for each muscle. All IMU data were first transferred into a shared coordinate system (reference: central upper back IMU). We corrected all accelerometer readings for gravity and converted each IMU orientation from quaternions into the Tait-Bryan angles pitch, yaw, and roll (rotation sequence z-y'-x''). This resulted in a total of eight EMG channels (one per muscle) and 45 IMU channels (three directions x three modalities x five IMUs).

Feature extraction and engineering
We extracted features every 50 ms using causal 250 ms windows of each channel's data. Twelve EMG features described properties of the time domain, frequency domain, or a probability density estimation; four IMU features were extracted from the time domain. The EMG time domain features were the (i) mean absolute value [40], (ii) band power, (iii) root mean square [19], (iv) zero crossings [40], (v) slope sign changes [40], and (vi) Willison amplitude [41], as well as the (vii) mean absolute value slope [40] and the (viii) waveform length [40]. We extracted the (ix) mean [42] and (x) median frequency [42] of the frequency domain and the (xi) entropy [43] and (xii) trimmed mean [43] of a probability density estimation. IMU features were the (i) mean absolute value [23], (ii) root mean square [23], (iii) mean absolute value slope [23], and (iv) waveform length [23] of the time domain. Table 2 gives an overview of all features, their domain, and calculations. In total, this resulted in a feature vector containing 591 features (12 EMG features x 8 EMG channels + 4 IMU features x 45 IMU channels) per extraction window.
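As an illustration, the 250 ms/50 ms windowing and four of the named time-domain features can be sketched as follows (standard definitions from the cited literature; a minimal sketch, not the study's code):

```python
import numpy as np

def sliding_windows(x, fs, win_s=0.25, step_s=0.05):
    """Yield causal windows of win_s seconds every step_s seconds."""
    win, step = int(win_s * fs), int(step_s * fs)
    for end in range(win, len(x) + 1, step):
        yield x[end - win:end]

# Standard definitions of four time-domain features, per channel window w:
def mav(w):             return float(np.mean(np.abs(w)))        # mean absolute value
def rms(w):             return float(np.sqrt(np.mean(w ** 2)))  # root mean square
def zero_crossings(w):  return int(np.sum(np.diff(np.sign(w)) != 0))
def waveform_length(w): return float(np.sum(np.abs(np.diff(w))))
```

Applying each feature function to every window of every channel and concatenating the results yields one feature vector per 50 ms step.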

Post hoc analysis of the control group
We used the data of the control group experiments to determine the general feasibility of grasp type decoding based on different modalities (EMG, IMU, or their combination). For this, we established a window of interest (WOI) of [−1.75, 0] s with respect to the grasp onset, which we defined as the moment when the foil pressure sensor on the object detected the first touch of the object. We epoched all data within the WOI into 36 overlapping causal 0.25 s windows (step size 0.05 s). This allowed us to evaluate each window separately and participant-wise (e.g. only the features extracted from the interval [0.50, 0.25] s prior to grasp onsets). For each window, we trained separate Random Forest (RF) predictors [45] using a five-fold cross-validation technique to avoid overfitting (1000 decision trees per RF, no maximum depth, split criterion: Gini impurity). We chose the RF architecture for our experiments, first, to be in line with our previous work in grasp decoding and, second, due to its comparably very good performance in activity prediction based on combined data sets of EMG and IMU information [24]. We report the grand average of each window's average prediction accuracy over all folds and participants as a measure of performance. In addition, we were also interested in the decoding performance when using EMG or IMU data alone. We applied the same approach but used only the respective features. (Table 2 details all feature calculations: the time domain features are based on a signal x of length N [19,23]; the mean and median frequency are based on a discrete Fourier transform of x with M bins, with P_j the spectrogram intensity and f_j the frequency of bin j [42]; the entropy of density and the trimmed mean are based on a kernel density estimate f [44], where the trimmed estimate f_t excludes the 5% lowest and highest values of f [43].) In a comparative analysis using a repeated measures analysis of variance (RANOVA), we investigated prediction accuracy differences (p < 0.05) when using EMG or IMU data only as well as their combination. Lastly, after completion of their experiments, we repeated the evaluation with the data of the participants with SCI for a comparative group analysis with the same parameters.
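The per-window decoding analysis can be sketched with scikit-learn; this is a minimal illustration under the stated hyperparameters, and the array layout and function name are assumptions rather than the study's implementation:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def windowwise_accuracy(features, labels, n_windows=36,
                        n_estimators=1000, seed=0):
    """Per-window RF decoding accuracy, as in the post hoc analysis above.

    features: (n_trials, n_windows, n_features) array holding one feature
    vector per 0.25 s window inside the WOI; labels: (n_trials,) grasp
    conditions. Hyperparameters follow the text (1000 trees, Gini split
    criterion, no maximum depth); the array layout is an assumption.
    """
    accs = []
    for w in range(n_windows):
        rf = RandomForestClassifier(n_estimators=n_estimators,
                                    criterion="gini", max_depth=None,
                                    random_state=seed)
        # five-fold cross-validation per extraction window
        scores = cross_val_score(rf, features[:, w, :], labels, cv=5)
        accs.append(scores.mean())
    return np.array(accs)
```

Averaging the returned accuracies over participants yields the grand average curve per window position reported in the results.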

Behavioral analysis
We also wanted to assess the variation of reach-and-grasp timings between the different grasping conditions and between participant groups. We defined the reach-and-grasp timing as the time difference between movement onset (the moment the hand lifts from the sensorized starting position) and the first touch of an object. For this, we conducted a behavioral analysis to determine these differences (p < 0.05) in reach-and-grasp timings within and between groups using a mixed analysis of variance (mixed ANOVA). In the case of the bimanual reach and grasping condition, we used the earliest lifting of one of both hands.

Co-adaptive closed-loop HMI
Our experiments with participants with SCI included a co-adaptive closed-loop HMI approach that utilized feedback loops between participants and models. For this, we implemented classification models that predicted each participant's data in real-time and enabled the system to give audio feedback based on the success of those predictions. We defined two objectives: (i) the discrimination between movement and rest (MR, 2 classes: movement or rest) and (ii) the discrimination between different grasp types (GT, 3 classes: unimanual lateral, unimanual palmar, or bimanual palmar grasp). This two-stage design minimizes false positive grasp predictions when no grasp intention is present. Participants always received feedback on whether the MR prediction was successful but received feedback on the GT prediction only after a correct MR prediction. We used RFs as predictors (1000 decision trees per RF, no maximum depth, split criterion: Gini impurity) and retrained them regularly to ensure system adaptation: both MR and GT predictors were (re-)trained after each experimental block of 5 TPC and 1 minute of rest. Their training sets contained the EMG and IMU features of all trials of the participant available at the time, extracted from the 0.25 s prior to grasp onsets and, respectively, from 15 random, non-overlapping 0.25 s intervals of each rest phase.
Training sets were balanced for class frequency and thus contained either (i) movement (= masked grasp trials) and rest conditions (MR set) or (ii) the grasp type conditions without rest data (GT set). No data was shared across participants. Each model adaptation was done using a five-fold cross-validation, of which we saved the best-performing MR and GT models as new main predictors that drove the feedback until the next adaptation step. We enabled the feedback after the first experimental block. One beep indicated a correct movement intention recognition (MR model), and two beeps indicated both a correct movement intention and a correct grasp prediction (MR + GT model). This allowed the participants to modify their future actions towards higher MR and GT discriminability. With eight experimental blocks, sessions resulted in seven MR and GT models that each acted as main predictors for one block. Although not used for feedback, deprecated models still predicted incoming data. This allowed us to compare the performance of all models over time. Figure 3 shows this co-adaptive closed-loop HMI. Additionally, we were interested in the differences between groups (control group/participants with SCI) and between modalities (only EMG/IMUs, or both). We evaluated this through multiple post hoc simulations of the closed-loop experiment and tested for significant differences with a repeated measures analysis of variance (RANOVA). Naturally, our post hoc simulations did not include the auditory feedback loop.
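The block-wise model adaptation, in which the best of five cross-validation fold models becomes the new main predictor, might look like this. This is an illustrative sketch assuming scikit-learn; the helper name and structure are not from the study:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold

def retrain_best(X, y, n_estimators=1000, seed=0):
    """Train one RF per cross-validation fold and keep the best model.

    Mirrors the block-wise adaptation described above: after each block,
    models are retrained on all data available so far, and the
    best-performing fold model becomes the new main predictor.
    """
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    best_model, best_acc = None, -1.0
    for train_idx, val_idx in skf.split(X, y):
        rf = RandomForestClassifier(n_estimators=n_estimators,
                                    criterion="gini", max_depth=None,
                                    random_state=seed)
        rf.fit(X[train_idx], y[train_idx])
        acc = rf.score(X[val_idx], y[val_idx])   # validation accuracy
        if acc > best_acc:
            best_model, best_acc = rf, acc
    return best_model, best_acc
```

Calling this after every block with the accumulated MR or GT training set yields the sequence of main predictors that drive the feedback.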

Comparative performance evaluation
We compared the results of the co-adaptive closed-loop experiments between and within groups by evaluating each iteration's prediction accuracy. We further compared the feature importance of the MR and GT predictors of each participant with SCI to grand control group averages. For this, we extracted each RF's mean decrease in Gini impurity block-wise and calculated averages over MR or GT predictors. We combined the importances of the three directional parameters of each IMU modality (e.g. the importances of an IMU's acceleration in x, y, and z are summed up). Values were then expressed relative to their participant's or group's maximum and presented graphically, such that information source and feature type importance are expressed simultaneously.
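The importance aggregation, summing the three directional components per IMU modality and scaling to the participant's or group's maximum, can be sketched as follows (a minimal sketch; the grouping labels are hypothetical):

```python
import numpy as np

def grouped_importances(importances, groups):
    """Sum feature importances over channel groups and scale to the maximum.

    importances: 1-D array of mean-decrease-in-Gini values (e.g. an RF's
    feature_importances_ attribute); groups: mapping from a label such as
    'hand-IMU acceleration' to the indices of its x, y, and z components.
    The grouping labels are illustrative, not the study's naming.
    """
    summed = {name: float(np.sum(importances[np.asarray(idx)]))
              for name, idx in groups.items()}
    peak = max(summed.values())                  # per-participant maximum
    return {name: v / peak for name, v in summed.items()}
```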

Behavioral analysis
Figure 4 depicts the group-wise reach-and-grasp timings for participants of the control group and participants with SCI per condition. Over all conditions, control group participants had an average timing of 1.03 ± STD 0.24 s, and participants with SCI of 1.39 ± STD 0.91 s. We applied a mixed ANOVA (F(1, 10) = 28.6, p < 0.001) on the reach-and-grasp timings. Mauchly's test indicated that the assumption of sphericity was not violated. We found significant differences between the reach-and-grasp timings of both groups. Post hoc tests for multiple comparisons using the Tukey-Kramer criterion showed significant differences between (i) groups in each condition (p < 0.01 for all comparisons) and (ii) lateral and palmar grasps in both groups (p < 0.001 for all comparisons). There was no significant difference between uni- and bimanual palmar reach-and-grasp timings within either group.

Assessing the best classification interval and modality
Analysis of control group data for grasp type decoding showed dependencies on (i) the temporal positioning of the feature extraction window and (ii) the selection of included features. Figure 5 depicts the grand average classification accuracies of only EMG and only IMU features as well as of the combined (multimodal) approach at extraction time points up to 1.75 s before grasp onsets. Classification accuracies were higher the closer a 0.25 s extraction window was to the grasp onset (peak 98.67% at t = 0 s, multimodal approach). All group average accuracies <1.5 s prior to a grasp onset exceeded the chance level (40.86%, adjusted Wald interval with α = 0.05). We found significant differences between modalities using a repeated measures ANOVA with F(2, 934) = 273.9, p < 0.001, corrected by Greenhouse-Geisser due to a sphericity violation (Mauchly's p < 0.05). While the multimodal approach performed best with an average accuracy of 77.84 ± STD 24.17%, features extracted from IMU data were slightly more discriminable than EMG features (73.46 ± STD 23.95% > 69.90 ± STD 22.92%). In participants with SCI, we found significant differences between modalities using a repeated measures ANOVA with F(2, 286) = 72.34, p < 0.001, corrected by Greenhouse-Geisser due to a sphericity violation (Mauchly's p < 0.05). Average accuracies were 66.85 ± STD 15.80% (multimodal approach), 61.90 ± STD 14.97% (IMU only), and 61.79 ± STD 13.89% (EMG only), respectively. Pairwise comparisons incorporating the Tukey-Kramer criterion showed significant differences only between the multimodal and the single-modality approaches (p < 0.05 for both comparisons) but not between EMG and IMU features. All classification approaches had their peak accuracies directly prior to the grasp onset, and group averages exceeded the chance level of a random classifier.
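For reference, such an adjusted Wald chance bound can be computed as follows; this is a sketch using only the Python standard library, and the exact trial counts behind the bounds reported here depend on the respective analyses:

```python
from statistics import NormalDist

def adjusted_wald_upper(n_trials, n_classes, alpha=0.05):
    """Upper bound of the adjusted Wald CI around chance-level accuracy.

    A decoder is considered above chance when its accuracy exceeds this
    bound at the given alpha. n_trials and alpha are placeholders; the
    bounds in the text follow from the trial counts of each analysis.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    x = n_trials / n_classes                  # expected hits under chance
    p = (x + z * z / 2) / (n_trials + z * z)  # adjusted proportion
    half = z * ((p * (1 - p) / (n_trials + z * z)) ** 0.5)
    return p + half
```

The bound tightens toward 1/n_classes as the number of trials grows, which is why analyses with fewer trials require higher accuracies to count as above chance.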

Online predictor performance
Our co-adaptive closed-loop HMI correctly predicted on average 79.3 ± STD 7.4 of 105 grasping trials and was able to identify movement in all remaining ones during the experiment with participants with SCI (table 3). No trial was simultaneously predicted wrong by both MR and GT predictors. Table 4 sums up the prediction accuracies at each model iteration. All averaged MR and GT predictors performed above chance levels (MR: 71.98%, GT: 55.04%, adjusted Wald intervals with α = 0.05). Grand average prediction accuracies peaked at 99.64 ± STD 1.89% for movement versus rest (MR, stage one model) and 75.39 ± STD 13.77% for grasp prediction (GT, stage two model). True positive rates for the GT models were in general higher for lateral vs. palmar grasp discrimination than for uni- vs. bimanual grasp prediction. The comparison of the grand average GT predictors per iteration shows an increase in performance until the fourth iteration, followed by comparably stable predictions from there on (see figure 6). Both the group average and the individual results (appendix figure A1) show that GT prediction peaks were mostly based on the newly trained predictors.
The simulation of the online experiment with only EMG or only IMU features resulted in grand average accuracies of 99.52 ± STD 1.19% (EMG only) and 99.29 ± STD 1.89% (IMU only) for MR and 70.48 ± STD 14.45% (EMG only) and 67.38 ± STD 21.42% (IMU only) for GT predictions (table 4). We could not find significant differences between modalities for MR prediction using a repeated measures ANOVA with F(2, 54) = 0.49091, p > 0.05. In contrast, modalities showed significant differences for GT prediction with F(2, 54) = 3.3896, p < 0.05, with the multimodal approach performing best. Simulations of the online experiment with control group data with both EMG and IMU information resulted in, on average, 102.7 ± STD 2.3 correctly predicted trials out of 105 (appendix figure A2). We found significant differences between modalities using a repeated measures ANOVA with F(2, 180) = 35.486, p < 0.05 (corrected by Greenhouse-Geisser due to sphericity violations with Mauchly's p < 0.05), with the multimodal approach performing best for GT predictions.

Feature importance
Our models differentiating between movement and rest relied on many features of different origins but showed adaptation to individual participants with SCI (figure 7). We found that the control group's grand average rated most proximal EMG features, gyroscope readings, and the distal mean absolute value slope IMU features highest. Models generally did not consider the feature types waveform length, mean, and median frequency, nor features from the triceps and the acceleration and orientation of the IMU on the back. All models of the participants with SCI relied on the EMG of the deltoid muscle (especially pars clavicularis), but some also included distal EMG information, for example, of the thumb muscle or finger flexors (participants S-01 and S-04). In contrast to the control group, IMU features were either not considered (most proximal channels) or showed importance scattered across more distal channels. Similar to the control group, models generally rated the feature types waveform length, mean, and median frequency as not important. Further, all models of the participants with SCI rated acceleration and orientation information as more important than gyroscope readings. However, EMG features were treated differently across participants: features of the biceps were, for example, not included in participants S-01 and S-04, and those of the thumb not in participants S-02 and S-03. Models differentiating between the three grasp types showed clear prioritization of especially channel-wise information (figure 8). Average control group models mainly utilized the EMG of the thumb muscle as well as orientation readings, whose importance increased in more distal extremity parts. Other IMU information was neglected, similar to the EMG of the triceps and deltoid (pars clavicularis) muscles. Feature types rated as not important were the waveform length (both modalities) as well as the mean absolute value slope (IMU only). Especially important feature types were the mean absolute value and root mean square, as well as, for EMG information, the mean absolute value slope, band power, trimmed mean, and entropy of density. Models of the participants with SCI focused on the EMG of the acromial part of the deltoid muscle (participants S-01 and S-03), of the brachioradialis (participant S-02), or of the opponens pollicis muscle (participant S-04). They neglected distal gyroscope readings but showed scattered importance over the remaining IMU channels. The important EMG feature types were the same as in the control group.

Discussion
In this study, we showed the feasibility of the multimodal usage of EMG and IMU information for movement and grasp intention decoding of persons with SCI. Our models performed significantly (p < 0.05) better with both input sources than with isolated EMG or IMU data only. We found the features extracted from the 0.25 s prior to the grasp onset to hold the most discriminative information and were able to correctly predict 79.3 ± STD 7.4 (102.7 ± STD 2.3 control group) trials out of 105 in our real-time, co-adaptive closed-loop experiment with participants with SCI. We achieved grand average grasp prediction accuracies of 75.39 ± STD 13.77% in participants with SCI and 97.80 ± 4.77% in the simulated, adaptive control group experiment, as well as movement vs. rest prediction accuracies above 99.64% with 100% sensitivity in both groups. We showed the superiority of co-adaptive closed-loop interactions between human and machine compared to open-loop classification and found clear feature prioritization in the models of the homogeneous control group. Further, models of the participants with SCI uniquely adapted to each individual participant's capabilities in real time.

Offline analysis
We found our multimodal approach of considering both EMG and IMU data as input sources for classification to result in significantly (p < 0.05) increased grasp type prediction accuracies within both groups when compared to the isolated use of either the IMU or the EMG modality. This complies with related findings on the multimodal usage of EMG and IMU signals for activity classification by Bangaru et al [24] and shows that the key to a successful multimodal approach is the appropriate utilization of complementary sensor information and an appropriate feature extraction. Our analysis regarding the best classification time point showed that data directly prior to a grasp onset hold the most discriminative information for grasp classification, similar to findings of EEG-based reach-and-grasp classification in persons with SCI [46]. This means that this information is best suited to train classification models for further use and should be prioritized for training data sets. Nonetheless, we also found above-chance-level grasp type classification closely prior to the movement onset, showing that the preparation for the grasp movement occurring more than one second later already holds significant discriminative information, probably incorporating factors such as shifts of the torso and changes in muscle tension.

Behavioral analysis of reach-and-attempted grasp movements
The group of participants with SCI consisted of persons with varying upper limb motor capabilities and was, as such, highly heterogeneous. Although they were all able to perform the defined reach-and-grasp conditions successfully, they attempted their grasp movements with varying degrees of deviation from the optimal end-point trajectory. An often-seen movement pattern was, for example, the lateral grasp without large-scale adjustments (strong flexion or extension) of the index finger, or object clamping in bimanual palmar grasps to compensate for weak finger strength with extensive arm and shoulder strength. Compared to the control group, we noticed significant differences in reach-and-grasp timings in people with SCI. They generally needed longer (p < 0.001) and showed a higher variation of reach-and-grasp timings overall and for each grasp type individually (p < 0.01 for all comparisons). We additionally found that both groups showed significant differences between the reach-and-grasp timings of lateral and palmar grasps (p < 0.001 for all comparisons). We hypothesize that group-wise differences and variations are not only caused by the lower sample size of the participants with SCI (n = 4) but mainly originate from their heterogeneous neurological and functional status with respect to upper extremity muscle functions. This resulted in varying reach-and-grasp strategies with on-the-fly adaptations. Due to the long experiment time, increased muscle fatigue might also have contributed to the variations. The varying reach-and-grasp timings support the use of data extracted relative to the grasp onset and not to the start of a reaching movement, as rigid offsets (e.g. 1 s after the movement onset) cannot account for the observed high reach-and-grasp timing variability.

Online single trial prediction
Our co-adaptive closed-loop HMI was able to predict correctly, on average, 79.3 ± STD 7.4 (102.7 ± STD 2.3 control group) trials out of 105 during sessions with the participants with SCI. Every single trial was predicted as a movement by the movement vs. rest predictors (100% sensitivity, accuracy ⩾ 99.64%). Grand averages of grasp predictors were at 75.39 ± STD 13.77% (participants with SCI) and 97.66 ± STD 5.48% (control group), respectively, and simulation runs showed that our multimodal approach performed significantly better (p < 0.05 for both comparisons) than isolated EMG or IMU information for grasp type predictions in online settings. We could exceed the grasp decoding performance of similar, EEG-based work with persons with SCI [46] and, although not directly comparable due to different experimental paradigms, the results of real-time EMG-based decoding of 10 individual hand and wrist movements in non-disabled people [29] and of EEG-based grasp decoders for non-disabled persons and persons with SCI [10, 11]. Our control group models further performed similarly to or better than the hand gesture classification models of Vásconez et al [25]. We found higher prediction accuracies between palmar and lateral grasps than between uni- and bimanual grasps and hypothesize that a sensorization of both limbs allows for higher discrimination of uni- and bimanuality. Nevertheless, we observed the superiority of block-wise adapted predictors for grasp type differentiation in comparison to rigid open-loop models at almost every time point during the experiments (see figure 6). They integrated the most recent information, including sensor data as well as the user's adaption to a predictor's output, until human and machine were fully adapted to each other.
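The two-stage scheme described here (a movement vs. rest predictor gating a subsequent grasp type predictor) can be sketched with scikit-learn Random Forests as follows. The synthetic data, feature dimensions, and function names are hypothetical stand-ins and do not reproduce the paper's pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic feature matrix: 120 windows x 20 features (illustrative sizes).
X = rng.normal(size=(120, 20))
y_move = rng.integers(0, 2, size=120)   # 0 = rest, 1 = movement
y_grasp = rng.integers(0, 3, size=120)  # 3 grasp classes

# Stage 1: movement vs. rest; stage 2: grasp type, trained on movement windows only.
stage1 = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y_move)
stage2 = RandomForestClassifier(n_estimators=100, random_state=0).fit(
    X[y_move == 1], y_grasp[y_move == 1])

def predict_two_stage(x):
    """Stage 1 gates stage 2: grasp type is only decoded if movement is detected."""
    if stage1.predict(x.reshape(1, -1))[0] == 0:
        return "rest"
    return int(stage2.predict(x.reshape(1, -1))[0])

out = predict_two_stage(X[0])
```

Gating the grasp predictor with the movement detector mirrors the design choice that rest periods should never trigger a grasp classification.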

Feature importance
We evaluated all predictor models to identify highly discriminative features for both movement vs. rest and grasp type prediction. In the control group models, proximal EMG information, gyroscope readings, and distally generated mean absolute value slope IMU features were found to hold the most discriminative information for movement vs. rest differentiation.
Grasp type differentiation mainly used information from the thumb muscle opponens pollicis and each IMU's orientation. We found our models to adapt to each individual participant with SCI (see figures 7 and 8) with different importance scores for individual EMG or IMU features. There was no case in which only EMG or IMU features were considered while the other modality was fully neglected. Models often rated features from paralyzed muscles or segments of the upper limb as less important than ones from muscles with preserved residual muscular functions. For example, grasp predictors preferred the distal EMG information of the brachioradialis (participant S-02, Neurological Level of Injury (NLI) C4, American Spinal Injury Association Impairment Scale (AIS) C) or focused on the proximal EMG information of the acromial part of the deltoid muscle (participant S-03, NLI C4, AIS B). We further identified features that were not rated as important. As such, the EMG feature types waveform length and the frequency measures mean and median frequency were rated as not important across both models and groups and, despite their role in control group movement vs. rest differentiation, gyroscope readings seemed to be neglected by almost all models in participants with SCI.
Our control group findings show that reach-and-grasp movements start with shoulder and upper arm muscular activity, and that grasp types differ mainly in thumb movements and wrist rotation. We believe that our models of the participants with SCI highlight the heterogeneity of their group and further show that our co-adaptive approach enables unique adaptation to individual capabilities in real-time. In this respect, one has to keep in mind that the feature importance, especially of participants with SCI, does not necessarily reflect neurological characteristics but rather the availability of discriminative information within the acquired data.
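The importance measure used throughout this section, the mean decrease in Gini impurity of a Random Forest, is exposed directly by common implementations. The following sketch with synthetic data (all names and sizes are assumptions) shows how a single informative feature surfaces in scikit-learn's `feature_importances_`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Synthetic data: feature 0 fully determines the label, the rest are noise.
X = rng.normal(size=(300, 10))
y = (X[:, 0] > 0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)

# feature_importances_ holds the mean decrease in Gini impurity per feature,
# normalized so that the values sum to 1.
importances = model.feature_importances_
print(importances.argmax())  # feature 0 dominates
```

As noted above, a high score reflects the availability of discriminative information in the data, not necessarily a physiological cause.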

Transfer to end users
We could show in principle that our multimodal approach of using EMG and IMU information for movement and grasp detection in real-time successfully translates to persons with SCI. Hence, we believe that intuitive control and a transfer to end users on a larger scale is not out of reach for persons with movement impairments. The comparatively small sample of end users with SCI in this study already showed high variability in their residual movement capabilities, which is also partly reflected in the STD of the overall decoding performances. This indicates that for a successful transfer, the residual capabilities of end users need to be assessed individually so that a closely tailored, optimal placement of sensors can be performed. While the discriminative information per trial is higher than in comparable EEG-based approaches [9, 11], our co-adaptive closed-loop approach further demonstrated that lengthy, externally supervised calibration procedures can be shortened while still maintaining high performance. This can lead to an earlier start and generally higher efficiency of rehabilitation interventions, such as functional electrical stimulation (FES) based training, as the overall need for expert input is reduced. Moreover, a combination of an assistive grasp device and sensor technology in a sleeve could further reduce preparation time and increase the usability for end users.

Current limitations and outlook
So far, the results of the participants with SCI cannot be generalized, as we have only investigated four participants. However, these four participants already showed high variation in their residual capabilities, to which our models could successfully adapt. It remains debatable whether a larger cohort with such high inter-subject variability in residual movement capabilities will provide further meaningful insights, but we will continue to increase our sample size for a better depiction of this highly heterogeneous population. Our results are also partly based on simulations of online experiments with either offline acquired data or data containing one feedback loop, as participants with SCI adapted their behavior according to the audio feedback. We still need to investigate the model performance across sessions of the same participants. Transfer of predictor models across multiple sessions may be challenging, as signals and characteristic features depend on EMG electrode or IMU sensor locations, and a trade-off between robustness and adaptiveness needs to be determined. We hypothesize that with our current approach, a successful translation across sessions might be feasible, e.g. by using average group results as initial models for shortened co-adaption periods at the start of each session. There remain further challenges towards end user applications that include our co-adaptive closed-loop HMI. First, we found that features extracted prior to the grasp onset, in this work defined as the first touch of an object, are best suited for model training. This requires a universally applicable method to determine this time point without restricting the usability of the assistive device, for example, through pressure sensors fixed at the end user's hands in a grasp neuroprosthesis setting. Second, assistive devices differ in degrees of freedom and in the grasp types they can provide. Therefore, the grasp types to be classified depend on the specific assistive devices in use. Third, assistive devices may interfere with our current data acquisition strategy. For example, neuroprostheses operating through FES can alter EMG signals. Although there are already approaches for simultaneous FES and volitional EMG measurements [47, 48], these approaches still need to be either refined in order to translate to everyday use for end users or changed towards a sequential system of EMG acquisition and FES application. Further work should also be conducted towards faster co-adaption to each end user's physiological capabilities and the respective important features to reduce signal noise and computational load. Similarly, further in-depth analyses could determine the best combination of sensor modalities and localization that still allows models to adapt to an individual's capabilities. Although we achieved higher accuracies with our multimodal approach, not all sensors may be necessary, and classification results may change with the number of features and/or sensors. A well-calibrated trade-off between performance and complexity may contribute to an increased level of usability and applicability. We believe that our approach is not limited to persons with SCI but could also be translated to other persons with movement impairments due to amputation or neurodegenerative pathologies.
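The block-wise co-adaption scheme referred to in this and the previous sections, retraining after every block on all data seen so far and predicting only upcoming trials, can be outlined as below. Block size, feature count, and the synthetic data generator are illustrative assumptions, not the paper's actual protocol.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

def make_block(n_trials=15, n_features=20):
    """Stand-in for one experimental block of labelled feature windows."""
    y = rng.integers(0, 3, size=n_trials)
    # Class-dependent offset so the predictor has something to learn.
    X = rng.normal(size=(n_trials, n_features)) + y[:, None]
    return X, y

X_seen, y_seen = make_block()          # block 1 only trains, is never predicted
accuracies = []
for _ in range(7):                     # blocks 2..8
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_seen, y_seen)          # retrain on all data seen so far
    X_new, y_new = make_block()        # predict only the upcoming block
    accuracies.append((model.predict(X_new) == y_new).mean())
    X_seen = np.vstack([X_seen, X_new])
    y_seen = np.concatenate([y_seen, y_new])
```

Because each retraining folds in the newest trials, later predictors reflect both fresh sensor data and the user's adaption to previous feedback, which is the mechanism behind the block-wise improvements reported above.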

Conclusion
With this work, we showed that a multimodal EMG and IMU data approach can be used for reach-and-grasp decoding in participants with SCI. Our combined approach performed significantly better than isolated EMG or IMU data alone (p < 0.05) and correctly predicted 102.7 ± STD 2.3 out of 105 trials in a simulated closed-loop setting for control group data and 79.3 ± STD 7.4 out of 105 trials in participants with SCI in a co-adaptive closed-loop setting. We found the 0.25 s directly prior to a grasp onset to hold the most discriminative information and achieved grand average movement vs. rest prediction accuracies above 99.64% with 100% sensitivity in both groups. Our grand average grasp prediction accuracies based on multimodal information were 75.39 ± STD 13.77% for the participants with SCI and 97.66 ± STD 5.48% for the control group, respectively. Our sessions with the co-adaptive closed-loop HMIs demonstrated the advantage of co-adaptiveness in comparison to unadapted models, and we found clear feature prioritization in models of the homogeneous control group as well as unique, real-time-adapted models for each participant with SCI. We believe our findings can foster the development of multimodal and adaptive HMIs for persons with SCI to allow for intuitive control over assistive devices with the goal of improving personal independence.

Appendix
Figure A1. Co-adaptive grasp predictions during individual experiments of participants with SCI. Grasp predictors were retrained after every experimental block and predicted all upcoming trials. While the output of the newest trained predictor (latest predictor, green line) drove the human feedback loop, all other previously trained predictors (outdated predictors, purple lines) had no influence on the experiment. The box charts depict the distribution of the latest predictors. Their predictions are also summed up in the aggregated, normalized confusion matrices. The chance level (55.04%) was calculated using an adjusted Wald interval with α = 0.05.
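The chance-level computation referenced in the figure captions, an adjusted Wald (Agresti-Coull style) confidence bound around the theoretical guessing probability, can be sketched as follows. The trial counts behind the reported 55.04% and 40.86% thresholds are not restated here, so the example values below are purely illustrative.

```python
import math

def chance_level(n_trials: int, p0: float, alpha: float = 0.05) -> float:
    """Upper bound of the adjusted Wald confidence interval around the
    theoretical chance probability p0, given n_trials predictions.
    Accuracies above this bound are considered significantly above chance.
    Assumes the two-sided normal quantile for alpha = 0.05."""
    z = 1.959963984540054  # two-sided 95% normal quantile
    n_adj = n_trials + z ** 2
    p_adj = (p0 * n_trials + z ** 2 / 2) / n_adj
    return p_adj + z * math.sqrt(p_adj * (1 - p_adj) / n_adj)

# Illustrative call: 105 trials, 3 equiprobable grasp classes.
bound = chance_level(105, 1 / 3)
```

The bound shrinks towards p0 as the number of trials grows, which is why the threshold differs between figures with different numbers of predictions.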

Figure 1. Experimental design. Each experiment was divided into eight blocks of 1 min of rest and 5 trials per grasp type. Participants could perform a unimanual lateral grasp of a paperweight, or a uni- or bimanual palmar grasp of a jar.

Figure 2. Data acquisition targets. (a) Muscles acquired through surface EMG. Front view. (b) Locations of IMUs. Back view.

Figure 3. Co-adaptive closed-loop HMI design for the experiments with participants with SCI. Our classification approach incorporated two objectives: a discrimination between movement and rest to capture grasp initiation intention, and a subsequent discrimination between experimental grasp types. Co-adaption was performed through audio feedback (every trial) and model retraining (every 15 trials).

Figure 4. Behavioral analysis. Reach-and-grasp timings, evaluated with a mixed ANOVA, differed significantly with F(1, 10) = 28.6, p < 0.001. The boxplot includes the results of pairwise Tukey-Kramer comparisons between groups in each condition and between conditions in both groups.

Figure 5. Best classification time point and modality. (a) Grasp type decoding depended on the temporal positioning of the feature extraction window and the choice of modality. Individual results (averages of 5 repetitions of 5-fold cross-validations) of each participant are depicted in grey. Colored lines indicate the group average with shaded standard deviation. The average and shaded standard deviation of each group's movement onset are shown in blue. The chance level (40.86%) was calculated using an adjusted Wald interval with α = 0.05, and the probability (prob.) level was determined to be 33.33%. (b) Box charts with kernel density estimations of the individual distributions of each modality and significance reporting. Results were evaluated with repeated measures ANOVA in each group and compared pairwise according to Tukey-Kramer.

Figure 6. Co-adaptive grasp predictions during experiments with participants with SCI. Grasp predictors were retrained after every experimental block and predicted all upcoming trials. While the output of the newest trained predictor (latest predictor, green line) drove the human feedback loop, all other previously trained predictors (outdated predictors, purple lines) had no influence on the experiment. The lines depict the group averages of all participants with SCI, and the box charts their respective distribution. Predictions made by the latest predictors are summed up in the aggregated, normalized confusion matrix on the right. The chance level (55.04%) was calculated using an adjusted Wald interval with α = 0.05.

Figure 7. Feature importance of movement vs. rest predictors. Feature importance was extracted as the mean decrease in Gini impurity from all movement vs. rest predicting Random Forest models. Values are color coded relative to their participant- or group-wise maximum. Axis labels are only displayed for the control group average. IMU channels: Acc: acceleration. Ori: orientation. Gyr: gyroscope readings. Features: MAV: mean absolute value. MAVS: mean absolute value slope. RMS: root mean square. ZC: zero crossings. SSC: slope sign changes. WL: waveform length. WA: Willison amplitude. MNF: mean frequency. MDF: median frequency. BP: band power. TMD: trimmed mean of density. ED: entropy of density.

Figure 8. Feature importance of grasp type predictors. Feature importance was extracted as the mean decrease in Gini impurity from all grasp type predicting Random Forest models. Values are color coded relative to their participant- or group-wise maximum. Axis labels are only displayed for the control group average. IMU channels: Acc: acceleration. Ori: orientation. Gyr: gyroscope readings. Features: MAV: mean absolute value. MAVS: mean absolute value slope. RMS: root mean square. ZC: zero crossings. SSC: slope sign changes. WL: waveform length. WA: Willison amplitude. MNF: mean frequency. MDF: median frequency. BP: band power. TMD: trimmed mean of density. ED: entropy of density.

Table 1. Neurological characteristics of participants with SCI.

Table 2. Calculation table of EMG and IMU features. We extracted all features from EMG data, and the subset of features highlighted with an asterisk (*) from IMU data.
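For orientation, a few of the amplitude-based EMG features named in the table are commonly defined as below. These are textbook formulations and assumptions on our part, not necessarily the exact calculations of the table.

```python
import numpy as np

def mav(x):  # mean absolute value
    return np.mean(np.abs(x))

def rms(x):  # root mean square
    return np.sqrt(np.mean(x ** 2))

def wl(x):   # waveform length: summed absolute sample-to-sample differences
    return np.sum(np.abs(np.diff(x)))

def zc(x, thr=0.0):
    """Zero crossings: sign changes whose amplitude step exceeds `thr`."""
    return int(np.sum((x[:-1] * x[1:] < 0) & (np.abs(np.diff(x)) > thr)))

x = np.array([1.0, -1.0, 2.0, -2.0, 0.5])
print(mav(x), rms(x), wl(x), zc(x))  # mav = 1.3, wl = 11.5, zc = 4
```

Such features would be computed per channel over each extracted window before being fed to the classifiers.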

Table 3. Absolute number of correct predictions of the co-adaptive HMI (number of trials = 105). Predictors discriminated between movement and rest (MR, first stage) or grasp types (GT, second stage). Left column entries mark complete prediction failure, and right column entries complete prediction success. Trials where the HMI could recognize the general intent to grasp but failed to predict the correct grasp type are in the middle.

Table 4. Co-adaptive closed-loop prediction results of participants with SCI. The online experiment utilized both EMG and IMU information. The results of post hoc simulations with only EMG or IMU information are shown below the online results. Predictors were applied to trial and rest data (movement vs. rest) or only to trial data (grasp type).

Table A1. Simulated control group results of the closed-loop experiment with both EMG and IMU information. Predictors were applied to trial and rest data (movement vs. rest) or only to trial data (grasp type).

Table A2. Simulated control group results of the closed-loop experiment with only EMG or IMU information. Predictors were applied to trial and rest data (movement vs. rest) or only to trial data (grasp type).