Skilled independent control of individual motor units via a non-invasive neuromuscular–machine interface

Objective. Brain–machine interfaces (BMIs) have the potential to augment human functions and restore independence in people with disabilities, yet a compromise between non-invasiveness and performance limits their relevance. Approach. Here, we hypothesized that a non-invasive neuromuscular–machine interface providing real-time neurofeedback of individual motor units within a muscle could enable independent motor unit control to an extent suitable for high-performance BMI applications. Main results. Over 6 days of training, eight participants progressively learned to skillfully and independently control three biceps brachii motor units to complete a 2D center-out task. We show that neurofeedback enabled motor unit activity that largely violated recruitment constraints observed during ramp-and-hold isometric contractions thought to limit individual motor unit controllability. Finally, participants demonstrated the suitability of individual motor units for powering general applications through a spelling task. Significance. These results illustrate the flexibility of the sensorimotor system and highlight individual motor units as a promising source of control for BMI applications.


Introduction
Brain-machine interfaces (BMIs) aim to create an artificial link between intentions and actions. By detecting user intent from neural activity, BMIs can enable symbiotic human-machine interactions that are independent of the motor system and thus have great potential to augment human functions. Proof-of-concept clinical studies have tapped into this potential to restore independence in people with severe paralysis, demonstrating systems that allowed tetraplegic people to control robotic arms and exoskeletons [1,2], navigate computers [3], and even regain control of their own paralyzed limbs through electrical stimulation [4]. However, despite decades of advances, the reach of BMIs remains relatively limited, largely caused by the current trade-off between BMI invasiveness and performance [5,6]. Intracortical BMIs demonstrate outstanding performances but present significant associated risks [1][2][3][4]7]; noninvasive BMIs, such as those based on electroencephalography (EEG), have a low barrier-to-entry, but their poor spatial resolution and vulnerability to noise artifacts have so far limited them to specialized usecases and to information transfer rates too slow to control complex devices [5].
Alternatively, user intent can be accessed at the level of the muscles via surface electromyography (EMG), a non-invasive technology whose high temporal resolution recordings of single motor units support broadly applicable yet powerful applications. Neuromuscular-machine interfaces (NMIs) built on surface EMG (sEMG) seek to detect intended actions from muscle activity, as in the case of myoelectriccontrolled prostheses where intended hand movements are decoded from the activity of a range of upper-limb muscles [8,9]. Existing NMIs can effectively link predefined sets of muscle functions to device control signals, but typically require interfacing at least as many muscles as the number of actions to be controlled. Therefore, the bandwidth of these devices is often insufficient to enable effective control of assistive devices in people with paralysis, in which only a limited number of muscles can be used as sources of control. In addition, in decoding intended movements, these NMIs do not differentiate between movements aimed at device control or for interacting with the environment, making their utility for augmentative applications limited. To effectively enable these applications typically restricted to BMIs, NMIs should thus seek to increase the number of degrees of freedom (DoF) that can be extracted per muscle.
Henneman's size principle theorizes that individual motor units within a muscle are recruited in a fixed order [10] and thus cannot be controlled independently from one another. This longstanding theory of orderly recruitment has been primarily supported by experiments assessing motor unit recruitment properties during isometric, slowramping contractions within controlled laboratory conditions [10][11][12][13][14], and is consistent with the prevailing view that motor units within a muscle are controlled by a common descending neural drive [15]. According to this understanding, the bandwidth of existing NMIs would be neurophysiologically constrained. However, out-of-order motor unit recruitments have been observed during complex motor behaviors [16][17][18], and multidimensional descending neural drives have been shown to contribute to this flexibility [18]. In addition, pioneering studies in neurofeedback reported that people can learn to volitionally control individual motor units belonging to the same muscle when provided with visual and/or auditory feedback linked to the units' activity [19][20][21][22][23]. In particular, Harrison and Mortensen reported a subject that was able to learn, within an hour of training, to isolate and produce predetermined patterns of activity in four motor units of the tibialis anterior muscle [19]. The Basmajian group expanded on these findings with a series of exploratory studies, reporting selective control of up to six motor units across a variety of participant demographics and muscles, including the tibialis anterior, biceps brachii, and the abductor pollicis brevis [23]. These results suggest that individual motor units within a single muscle could potentially provide the per-muscle bandwidth required for powering BMI applications. However, because of the qualitative and observational nature of these neurofeedback studies, the extent to which individual motor units can be volitionally controlled remains largely unclear. Indeed, previous studies did not quantify control quality or independence, test whether motor units can be recruited both exclusively and simultaneously of one another, assess the ability to skillfully modulate firing rates, examine learning capabilities over time, or compare motor unit activities between periods of neurofeedback control with relevant isometric contractions of the same muscle. Answering these questions, in addition to significantly enhancing our understanding of flexibility in the neuromuscular system, could assess the suitability of using individual motor units as the source of control for a new class of NMIs for both translational and augmentative applications.
In order to address these questions, here we utilized a neurofeedback paradigm coupled with an operant learning task to interrogate the emergence and execution of skilled, independent control of individual motor units. We devised an NMI that provides visual and auditory feedback of biceps brachii motor units in real-time using neuromuscular signals recorded from a high-density grid of sEMG electrodes and trained eight participants over six consecutive days to use this system on a center-out task requiring skilled and independent control of three motor units. Participants increased in proficiency in this skilled motor unit control both within and across days of training, with modest control of at least two motor units demonstrated even on the 1st day. Through comparisons to isometric, ramp-and-hold contractions, we provide evidence that neurofeedback enabled participants to control individual motor units outside of recruitment constraints thought to limit motor unit controllability. We then demonstrated an application of such fluent motor unit control through a speller task, in which participants used motor unit activity to navigate a virtual keyboard to spell sentences. These results characterize skillful control of individual motor units, contributing to our understanding of sensorimotor flexibility and suggesting motor unit control can be a viable paradigm for both clinical translation and human augmentation applications. Participants are seated on a chair wearing a sensorized orthosis constraining the elbow joint at 100 • and the wrist at its neutral position. Load sensors are used to measure the isometric elbow-flexion and forearm-supination forces. IMU sensors are used to track arm movements. The NMI control loop is divided into four steps. First, biceps brachii neuromuscular signals are measured using a high-density grid of 64 surface EMG electrodes. Second, an online decomposition model is used to detect motor unit action potentials from the measured signals. Third, a decoder transforms the detected motor unit activity into task-dependent neurofeedback signals. Last, auditory and visual neurofeedback signals are delivered to the participants via headphones and a computer monitor. (B) Schematic of the user interface and neurofeedback signals used during the exploration procedure. Multi-channel waveforms of the detected motor unit activity are displayed and updated at 60 Hz. Neurofeedback of the detected motor unit activity is also provided by LED-like indicators flashing when an action potential is detected. Both waveforms and unit indicators are color coded. Colored signals indicate the activity of a subset of selected individual motor units. Black signals indicate the activity of unselected motor units. Finally, light-grey signals indicate detected events that have not been categorized as motor unit activity, i.e. unsorted activity. Auditory neurofeedback signals followed the same categorization between selected, unselected, and unsorted units and consisted of 150 ms pitch-coded stimuli. (C) Center-out task neurofeedback, decoder, and targets. The activity of three selected motor units is transformed into cursor position using a population coding schema. The cursor position is indicated by a grey arrow originating at the center of the screen and represents the population vector. The same unit-specific visual indicators and auditory stimuli employed in the exploration period are used here. A total of 12 peripheral targets (T1, T2, T3, and T4), 1 center target (T5), and 1 rest target were included. (D) Center-out task protocol. The task is divided into trials. To start a trial participants need to hold the cursor within the rest target for a minimum of 2 s. A target is then selected from a performance-dependent pool of available targets. At first, only T1, T2, and T5 targets were available. T3 and T4 targets were progressively added depending on participants' performance within that day. The trial's target is displayed and the participant has 60 s to achieve it before the trial is declared unsuccessful. wrist at its natural position ( figure 1(A)). After cleaning the skin with a mildly abrasive paste and isopropyl alcohol, a high-density 64-channel grid of sEMG electrodes (GR10MM0808, OT-Bioelettronica, Torino, Italy) was placed over the short and long heads of the biceps brachii, with the proximal/distal edges of the grid positioned at approximately 60%/80% of the distance between the acromion and the distal insertion of the biceps brachii tendon [24]. Velcro straps were used to ensure a tight fit of the orthosis around each participant's arm. Markings on the skin were used to ensure stable grid positioning across days.
We next calibrated the decomposition model used to extract individual motor unit activity from the measured neuromuscular signals. This initial calibration was performed offline on a recording of 60 s, during which the participants were instructed to perform subtle biceps contractions that would activate only a few motor units. To help participants in this task, we educated participants in recognizing individual motor unit action potentials from displayed raw neuromuscular signals, and encouraged them to use this simple form of neurofeedback to gauge their muscle activity. Participants were then introduced to the exploration procedure.

Exploration procedure
A computer screen and headphones were used to provide participants with real-time auditory and visual neurofeedback of the detected motor unit activity ( figure 1(B)). Visual neurofeedback consisted of color-coded LED-like indicators that flashed when an action potential was detected and plots of the corresponding multi-channel waveforms. Auditory neurofeedback mapped detected action potentials into pitch-coded 150 ms-long stimuli. Neurofeedback signals were updated at 60 Hz. Detected activity and corresponding neurofeedback signals were divided into three categories: selected units, unselected units, and unsorted activity. Selected and unselected units represented motor unit activity successfully classified by the decomposition model, while unsorted activity represented residual threshold-crossing events that were not matched with previously recognized motor units. Selected units were assigned to unit-specific neurofeedback features (i.e. colors and pitches), while those for unselected units and unsorted activity had categorical features.
Participants were instructed to use the provided neurofeedback signals to explore covert strategies to selectively recruit different motor units-mimicking pioneering studies on individual motor unit control [19][20][21][22]-and had approximately 30 min to select and sort in order of controllability the three motor units to use in the center-out task. To guide participants in their motor unit selection, we designed an algorithm that monitored motor unit activity in realtime and suggested units showing substantial evidence of independent control. Participants could rely on this algorithm to automatically define which units to be included in the selected units category but could also include, exclude, and reorder units at will.
Throughout the exploration period, the decomposition model was periodically updated until a maximum of 25 different motor units were detected. Participants could thus use the unsorted-activity neurofeedback to steadily recruit unsorted units of interest and assist the update algorithm in detecting these units.

Center-out task
Participants controlled a computer cursor using the three motor units selected during the exploration procedure to achieve targets displayed on a screen. The activity of the selected motor units was mapped into the 2D position of a computer cursor using a population-coding strategy (figure 1(C)). Each motor unit was assigned to a unique direction by dividing the 2D space into three equal subspaces (i.e. with a 120 • angle between each other) and provided a vectorial contribution to the cursor position along this direction and proportional to its normalized firing rate. To provide intuitive feedback on this control strategy, the cursor position was indicated by an arrow-representing the population vector-originating at the center of the screen. Motor unit firing rates were computed over a rolling window of 50 bins of 16 ms (800 ms in total) using a half-Hamming window profile that gave larger weight to the most recent bins. This firing rate was then normalized between 0 and the 90th percentile of the firing rate displayed during the exploration procedure. In some cases, this normalization value was manually adjusted between 10 and 20 Hz.
A total of 13 active targets and 1 rest target were designed. Active targets included 12 peripheral targets and 1 center target. Peripheral targets were defined by polar rectangular regions with a ∆θ of 45 • and ∆r of 0.39 population-vector magnitude and were divided into exclusive targets (T1, T2, and T3) and simultaneous targets (T4), depending on their center angle: exclusive targets were centered on the assigned motor unit directions and thus required exclusive recruitment of an individual motor unit; simultaneous targets laid between the assigned directions and thus required simultaneous recruitment of two units. Peripheral targets were also divided by distance: close targets were centered at 0.395, while far targets were centered at 0.785 magnitude. To achieve peripheral targets participants had to hold the cursor position within the target for a minimum of 0.5 s. The center target (T5) was defined by a circular region located at the center of the screen and had a radius of 0.2 magnitude. To achieve this target, participants were required to recruit all selected motor units at a minimum normalized firing rate of 0.33, while also keeping the cursor within the target boundaries. In contrast to active targets, the rest target required participants to avoid motor unit recruitment by holding the cursor within a distance of 0.1 from the screen center for 2 s.
The task was divided into trials and inter-trial periods. At the beginning of each trial, an active target was randomly selected from a pool of available targets and participants had 60 s to achieve it (figure 1(D)). The rest target was then displayed and participants could initiate the next trial by completing it. To promote learning, active targets were grouped into three difficulty levels, which were progressively made available depending on participants' performance. At the beginning of each session, only the center target (T5) and the motor unit #1 and #2 exclusive targets (T1 and T2) were available. An algorithm monitored the average trial success rate over a window of 5 min and if this surpassed a threshold of three trials per minute, targets belonging to the next difficulty level were made available: T3 targets were added first, T4 targets last.
To promote engagement and incentivize learning, task and trial performance metrics were displayed on the task monitor. Finally, in addition to the arrow indicating the cursor position, participants received neurofeedback of the selected unit action potentials via the same LED-like indicators and audio stimuli utilized in the exploration procedure. Participants trained on this task for approximately 60 min per day during the 1st 3 d, and for a minimum of 30 min per day on the last 3 d of experiments.

Speller task
The same three motor units from the center-out task were then used to operate a cursor to navigate a virtual keyboard in a copy-typing speller task. The keyboard layout (OPTI-II) and target sentences mimicked those of previous BMI studies [7,25]. The keyboard divided the screen in 30 square keys (6 × 5) and included all the alphabet letters, two space keys, and two delete keys; misselection of a character required participants to select the delete key.
To facilitate navigation, the keyboard featured wraparound borders and the cursor was controlled in velocity. In particular, the population vector used in the center-out task to compute the cursor position was here used to control the cursor velocity. These design features allowed full 2D space navigation even with control of a single motor unit, though this would result in extremely low performances. Letter selection was triggered by simultaneously recruiting the three selected motor units above a threshold normalized firing rate and for a threshold amount of timesimilar to how center-out T5 targets were achieved. Firing rate and time thresholds were default to 0.5 Hz and 0.5 s, and sometimes slightly adjusted according to participants' preference.
Participants were tested in this task for approximately 30 min in the last 3 d of experiments, after training for a minimum of 30 min in the center-out task and reaching sufficient proficiency. Four participants also tested this task prior to the 3rd day, but only after completing a minimum of 60 min of center-out task. One participant only completed 1 d of the speller task.

Force-control task
In a separate task from the center-out and speller tasks, participants were instructed to perform isometric elbow flexion and forearm supination contractions to match target force profiles displayed on a computer screen. The forces measured by the sensorized orthosis were displayed in real-time by a bar indicator. Target forces followed a trapezoidal profile-with onset, hold, and offset periods each trial-and were displayed adjacently to the measured forces. To prepare participants for a change in force profile, the target force expected 1 s ahead was also displayed.
Three isometric contraction types were tested: elbow flexion, forearm supination, and simultaneous elbow flexion and forearm supination. Each contraction type was tested five times at three different loads, for a total of 45 trials. Loads of 500 g, 1000 g, and 1500 g were default but in some cases decreased to avoid fatigue (lowest maximum load of 1000 g). Trials were separated by 2 s of rest period. Onset and offset ramps each occurred over 1, 2, and 3 s for the three total loads, respectively. Hold duration was 1 s in all trial types. Trials of different types were ordered randomly.

EMG recordings
Biceps brachii EMG signals were acquired using a PZ5M neurodigitizer amplifier and an RZ2 bioamp processor from Tracker-Davis Technologies (TDT) at 12.2 kHz. The 64-channels grid of electrodes was connected via 32-channels ZIF-clip TDT headstages and Omnetics connectors. Signals were band-pass filtered between 10 and 900 Hz using a 6th order Butterworth filter. Notch filters at 60, 120, 180, and 240 Hz were also used to remove the powerline noise. Filtered signals were then used to extract 56 bipolar derivations parallel to the muscle fibers. A multichannel threshold crossing algorithm was then used to detect time windows with potential motor unit activity; thresholds were set to six times the signal's standard deviation and were calibrated at the beginning of each session using 10 s recordings during which participants were instructed to avoid biceps contractions and not move. A threshold crossing event at any of the bipolar channels triggered a shared deadtime of 20 ms that limited overall threshold crossing detection rate at 50 Hz. Critically, since multiple motor units could be detected per threshold crossing, this capped detection rate only limited the maximum detectable firing rate of individual motor units and not that of the entire motor unit population (see section 2.2.2). In addition, since firing rates of biceps brachii motor units during isometric contractions range between 5 and 30 Hz [26], this detection rate was largely sufficient to cover motor units' entire dynamic range. Threshold crossing events and filtered bipolar signals were downsampled to 2 kHz and streamed to the decomposition model. All these processing steps were performed using custom software written for the RZ2 bioprocessor, which ensured a maximum of 0.5 ms delay between acquisition and streaming.

Decomposition model
Bipolar EMG signals were decomposed into motor unit activity using a convolutive blind source separation model. This model included a previously validated offline EMG decomposition model [27] and shared similar logic to recent techniques for online EMG decomposition [28]. The offline decomposition model used convolutive blind source separation to define the motor units underlying the measured EMG signals [27]. Briefly, the filtered bipolar EMG signals were extended and whitened. An extension factor of 16 was used [27]. Next, a two-step iterative algorithm was used to find sparse components that best reconstructed the whitened data. First, a fixed-point iteration algorithm was used to estimate the next component using the logarithm of the hyperbolic cosine as a contrast function to optimize sparseness and an orthogonal constraint to promote estimates of unique sources. The logarithm of the hyperbolic cosine was used because of its superior robustness to outliers compared with simpler contrast functions [27]. Second, an iterative algorithm was used to minimize the variability of the inter-spike intervals (ISIs) of detected spike-trains. After projecting the data onto the candidate component, K-means++ (k = 2) was used to estimate a threshold on the peaks in the squared projected data. The estimated component was then refined according to those peaks. This process repeated until the ISI converged. Since the coefficient of variation for spike-trains generated by multiple motor units are intrinsically more variable than those generated by a single motor unit, this 2nd step was shown to ameliorate source estimation by exploiting the regularity of motor unit firings [27]. The resulting component was then added to the matrix of estimated components if the signal to noise ratio (SNR) of the spikes detected along this component was greater than a fixed threshold; SNR was measured using the Silhoutte coefficient and a threshold of 0.85 was used [27]. This iterative algorithm, which is described in greater detail in Negro et al [27], was repeated until a maximum of 25 sources were detected. A post-processing step was then introduced to further de-duplicate the number of components underlying the same motor units. Indeed, despite the orthogonal constraint used in the fixed-point algorithm to increase the number of unique estimated sources, this approach can lead to components capturing delayed versions of the same motor unit action potentials [27,29]. Spike-trains were thus extracted from each estimated component and only components with less than 30% of coincident spikes-as measured by the rate-of-agreement [27] across spike timings-were kept. Note that while a minor inconvenience in offline analyses, an excessive number of duplicated components would largely impact computational load required by our NMI.
The offline model was initialized on the 60 s dataset acquired at the beginning of each session. This calibration was used to compute the whitening matrix and to initialize the decomposition matrix with the 1st set of estimated sources. This whitening matrix was then fixed for the remainder of the session. A batch update algorithm was then used throughout the exploration procedure to periodically update the decomposition matrix with potential new components. To optimize computational efficiency and allow for quick model updates (update time < 30 s), instead of using the full EMG stream this algorithm only ran on the windows of EMG signals surrounding the detected threshold crossing events (10 ms before the peak multichannel amplitude and 20 ms after). The update algorithm was triggered every 750 threshold crossing events with no extracted motor units and ran until a maximum of 25 total motor units were detected or until the end of the exploration procedure.
Individual motor unit activity was continuously estimated in real-time through this decomposition model from the streamed EMG signals using the 30 ms windows surrounding detected thresholdcrossing events. For each threshold crossing, data windows were whitened, extended, and projected to the source space by multiplying each extended multichannel sample with the most current decomposition matrix. A motor unit was then considered detected if at least a sample of the squared projected data exceeded the decomposition model's threshold for a given source, determined with k-means during the offline decompositions. Using this algorithm, multiple units could be detected from one threshold crossing event. If the projected activity did not surpass any component's threshold, the event was then classified as unsorted activity. Notably, windows alignment to a particular peak had negligible impact on the detection of concurrent motor unit action potentials, even when considering that the threshold crossing detection rate was capped at 50 Hz (see section 2.2.1). Indeed, motor unit classification was based on 8 ms data bins (extension factor = 16; sampling rate = 2000 Hz) and the overlap between windows at high threshold-crossing event rates (max 10 ms at 50 Hz) ensured detection of motor unit action potentials even when occurring between two consecutive events.
Online and offline decomposition models were implemented through custom-written GPUaccelerated Python programs. All data was streamed between multiple computers with minimal latency and high bandwidth through River [30], an opensource C++ library based on Redis. Overall latency from data acquisition to motor unit activity detection was generally under 70 ms.

Motor unit selection algorithm
This algorithm monitored the dimensionality of motor unit activity throughout the exploration procedure and suggested motor units with potential for independent control. A circular buffer (size of 2 16 samples) was used to collect sorted motor unit activity. The firing rate of each motor unit was then computed over overlapping windows of 1 s with 500 ms overlap. Non-negative matrix factorization (NMF) was then used to detect motor units explaining most firing rates variance. First, components required to explain a minimum of 90% of the total firing rates variance were selected. Second, the motor unit with the largest weight for each of the selected components was chosen and used to update the subset of suggested motor units. Suggested motor units were updated every 20 s.

Force and kinematic recordings
The sensorized orthosis was custom-designed and 3D printed using a Form 2 (Formlabs, Somerville, MA) printer with standard resin. The orthosis embedded two load cell sensors (a CB6 from DACELL, Korea and a TAL220 from HT Sensor, China) to measure elbowflexion and forearm-supination forces, respectively, and inertial measurement units (BNO055, Bosch Sensortec, Germany) to capture movements. Load and IMU signals were sampled at 50 Hz using a Raspberry Pi 4. A HX711 analog-to-digital converter (Avia Semiconductor, China) was used to acquire the load data. Data was streamed online to other modules using River.

Behavioral analysis 2.3.1. Center out task day 1
Center out performance at day 1 was evaluated using the percentage of successful trials for each target category (T1-T5). A trial was considered failed if the presented target was not achieved within the 60 s of trial and successful otherwise. Participants that did not reach the 2nd and 3rd difficulty levels were excluded when analyzing the corresponding target categories (T3 and T4 respectively). Hypothesis testing was performed using bootstrapping (n = 10 000 iterations) and Bonferroni correction for multiple comparisons.

Trial performance metric
While the percentage of successful trials allows to evaluate whether independent motor unit control is possible, this metric fails to capture the quality of this control. A more holistic performance metric was thus computed to assess motor unit control quality and evaluate learning over time. This metric combined together three independent metrics using principal component analysis (PCA). The normalized distance between the cursor position with respect to the presented target center was calculated for every time point within each recorded trial; normalization was performed with respect to the maximum target distance. The average and integral of this distance for each trial were then linearized using a log transform. These metrics were used to capture the cursor error and trial duration and were the 1st two independent metrics. The 3rd metric was used to reward motor unit specificity. A specificity score was first computed for each trial's time point as a value between −1 and 1, where −1 corresponds to selective recruitment of motor units that are not required for achieving the considered target and 1 to selective recruitment of the target motor units. The mean specificity was then calculated for each trial and linearized using the logistic transform. A PCA model was then fit on all the collected trials to combine these three metrics; the single holistic metric was then the 1st component of this PCA model, standard scaled to improve interpretability of the results. Feature linearization was performed to conform with the assumptions of the statistical techniques used to analyze learning over time. These analyses excluded T5 targets.

Learning analyses
Collected center-out data are characterized by hierarchical and crossed dependencies: trials (at the 1st hierarchical level) are grouped in days (at the 2nd level) and in participants (at the 3rd level), while target categories are crossed at all hierarchical levels.
To account for these dependencies, learning analyses were performed using linear mixed-effects models (LMMs)-an extension of linear regression models that allow to separate the overall effects of a model term (i.e. the fixed effects) from the variability in the data generated by different sources of stochastic variations (i.e. the random effects) [31].
When analyzing the overall within-and acrossday learning, trial performance was modeled by the following equations representing our general LMM: where j, t, i, θ, and r refer to the participant, day, trial, target angle, and target distance indexes, respectively; γ n refers to the fixed effect estimated for the nth independent variable; µ nl xyz refers to the lth random effect for the nth independent variable caused by the random factor xyz; β n xyz refers to the combined random and fixed effects; and ε jtiθr refers to the model residuals. This model describes trial performance y jtiθr as a function of within (within cwc jtiθ ) and between-day (across awc jtθ ) time variables, an interaction term between these two variables (within cwc jtiθ across awc jtθ ), and two additional variables used to control for potential across-day effects caused by differences in number of performed trials (within mean jtθ and within mean jtθ across awc jtθ ). The withinday time variable within cwc jtiθ was calculated as the centered, normalized trial index i. For each day t, subject j, and target direction θ, trials were centered with respect to half of the performed trials. Such centering within-cluster (CWC) was used to segregate within-day effects from higher order effects. A normalization factor of 100 trials was used. The subtracted means from within cwc jtiθ were in turn CWC centered and included in the model through the within mean jtθ term, which was used to account for possible changes in performances caused by the different number of performed trials for each recording. The acrossday variable across awc jtθ consisted of the aligned and normalized day index t. Alignment was performed within-cluster (AWC) with respect to the 1st day t for which participant j performed θ targets. While for targets belonging to the first difficulty level (i.e. T1 and T2 targets) AWC had no effect, this alignment strategy allowed to take into account participants' across-day heterogeneity in reaching T3 and T4 targets, effectively comparing across-day performances with respect to the number of days of practice instead of those of experiment. This variable was normalized with respect to 6 d. Maximal random effects were used to minimize type I errors during hypothesis testing [32]. Random effects included: random intercepts for each participant (µ 00 j ), target direction (µ 03 θ ), combination of participant and target direction (µ 02 jθ ), combination of participant, target direction, and day (µ 01 jtθ ), and combination of target direction and distance (µ 04 θr ); and random slopes for both the within-and across-day time variables (µ 10 jtθ and µ 20 jθ , respectively). Random effects were modeled as zero-centered normal distributions with estimated standard deviations σ and optional correlation parameter ρ. The centering and alignment choices used for within cwc jtiθ and across awc jtθ made the fixed-effect of the model intercept γ 0 to capture the overall performance of a general participant on the center-out task at day 1. The modeled fixed effects for the withinand across-day time variables represented the overall improvement in performance a general participant would obtain in the center-out task by training over 100 trials and 6 d, respectively.
Learning analyses performed for each of the selected motor units separately were carried out using a similar LMM, which included the same fixed-effect terms but reduced random-effects: y jtir = β 0 jtr + β 1 jtr within cwc jtir + β 2 jr across awc jtr + γ 3 within cwc jtir across awc jtr + γ 4 within mean jtr + γ 5 within mean jtr across awc jtr + ϵ jtir , β 0 jtr = γ 0 + µ 00 j + µ 01 jtr + µ 02 jr , where terms follow the same conventions as in the previous model. In particular, since different models were used to evaluate learning over T1, T2, and T3 targets, random effects that were used to account for variations caused by different target directions were removed. Random slopes for the within-day term were computed for each combination of participant, day, and target distance, while random slopes for the across-day term were computed for each combination of participant and target distance.
Analyses of participants' performance on the T4 targets were conducted using a generalized LMM (GLMM) with a Poisson link function. Specifically, the rate of successful T4 trials over time was modeled as: where terms follow the same convention as above, task_duration jt indicates the duration in hour of the center-out task performed by participant j at day t, and day jt indicates the tth experiment day of participant j. All models above parameters were fitted using the restricted maximum likelihood (REML) approach. Confidence intervals used for hypothesis testing were computed using the profile method. Model assumptions were tested using the White's Lagrange Multiplier test, for testing heteroskedasticity of the residuals, and the D'Agostino and Pearson's test, for testing residuals normality. All models (general, T1, T2, T3, and T4 models) displayed homoscedastic residuals (p = 0.08, p = 0.9, p = 0.3, p = 0.9, and p = 0.07, respectively), but only the residuals for the GLMM resulted normally distributed (p = 0.65 for the T4 model, p < 0.001 for the others). However, LMMs have been shown to be highly robust to violations of distributional assumptions and the kurtosis ([1.2, 0.77, 0.5, 2.5]) and skewness ([0.23, −0.19, 0.15, 0.7]) of our models with non-normal residuals' fell largely within acceptable ranges, shown to have minimal impact on the validity of LMMs estimates [33].

Kinematic analyses
Measured IMU Euler angles were preprocessed using an artifact removal algorithm and a 6th order Butterworth low-pass filter at 6 Hz. Artefact removal was used to ignore samples with prominence superior to 10 • , which accounted for less than 0.1% of all samples. PCA was then used to align the rotational axis of the IMU sensor to the axis of largest variation. Kinematic analyses during the center-out task used the 1st principal component to compute the mean absolute velocity (MAV) for trial and inter-trial periods. The median MAV was then computed for each day, each participant, and trial category and used to evaluate target-specific movement strategies. Statistics of active targets were compared with respect to those of rest targets; hypothesis testing was performed by bootstrapping (n = 10 000 iterations) the distribution of the paired differences for each recording and using Bonferroni correction of the estimated confidence intervals for multiple comparisons. Decoding analyses were performed using a multilayer perceptron (MLP) with one hidden layer of 100 artificial neurons having hyperbolic tangent as activation function. For each session, a MLP was trained to predict the cartesian coordinates (x, y) of the centerout task cursor from the measured Euler angles and their derivatives. Time bins during rest trials were excluded. Data were then shuffled and split in training (70%) and testing (30%) sets. Decoder performances were evaluated on the testing sets using the average R 2 of the two output variables.

Speller
A common metric for assessing information throughput in self-paced BMIs is the achieved bitrate, which combines the number of possible symbols to select (i.e. the number of characters on a keyboard) with the net number of correct symbols selected per second [34]. This metric is typically considered an underestimate of the true information throughput of a device, as it penalizes errors relatively harshly compared to other information throughput metrics [34]. It is defined as: where S c is the number of correct symbols transmitted, S i the number of incorrect symbols transmitted, and N the number of symbols. In our case, N = 27, due to the 26 letters and the 'space' character on the keyboard (excluding the delete key).
Smoothed bitrates were computed from 5 min sliding windows taken every 30 s; peak bitrate was the maximum smoothed bitrate value during a given session. Average bitrate was the achieved bitrate B computed over the entire spelling session. Correct characters per minute were computed similarly as the net number (correct symbols minus incorrect symbols) of correct characters spelled. Changes in average bitrate over days of training were modeled with a LMM where the number of days of training were centered withinsubject to account for differences in amount of training between subjects. This model fit a fixed-effect slope and intercept for days of training and was fit using the REML approach. Model assumptions were tested as described in the above learning analyses.

Motor unit activity analysis 2.4.1. Pooled motor unit decomposition
A separate offline motor unit decomposition was run for the EMG collected during the force-control task with the same parameters as the decompositions run online. Then, for each day, the motor units identified across both the online and offline decompositions were pooled together, and all of the EMG data for that day was then re-decomposed with these motor units, yielding a superset of motor unit action potential timings relative to those detected online.
Motor units exhibiting more than 30% of coincident spikes, according to the rate-of-agreement between action potential timings, were considered duplicates, and only one of the duplicate units was retained. No duplicates were found within selected motor units in any session. All analysis that used firing rates uses these pooled motor units. This methodology allowed for motor units to be identified for analysis purposes even when they had not been identified during the online sessions.

Integrated EMG (iEMG) and motor unit firing rates
The iEMG for channel i at time t was computed as the sliding window sum of rectified EMG: where N was fixed as the number of samples corresponding to a 200 millisecond window. The data was then downsampled by a factor of 25 to approximately 81 Hz. Smoothed motor unit firing rates were computed from the pooled motor unit firings and were computed in the same manner as computed online for the center-out task. Firing rates for pooled motor units were computed as in the center-out task. For analysis based on firing rates during the center-out task, any time bins occurring during T5 or rest trials were excluded. For analysis during the speller task, time bins used for letter selection were explicitly excluded as well. When necessary, both firing rate and iEMG were linearly interpolated in time in order to align with other streams of data (e.g. aligning with load sensor data).

Exploration period analysis
In order to identify groups of units that were often mutually active during the exploration period, motor unit activity was decomposed into three separate components via NMF. NMF aims to find two low-rank matrices, W and H, from a non-negative data matrix X such that: is minimized and such that W, H are also nonnegative. NMF was performed via a coordinate descent solver with NNDSVD initialization. Since the relative scales of the projections (W) and its components (H) are typically arbitrary, we resolved ambiguity by scaling each component to unit L2-norm and scaling its corresponding transformation by the appropriate reciprocal factor. We then computed the CIFT for each of the three components relative to one another, as described in the following section.

Cumulative independent firing time (CIFT) metric
For analysis of the speller and exploration period, a simple time-based metric, the CIFT, was devised. CIFT is defined as the fraction of total time a motor unit was independently active relative to the total time the motor unit was active, and thus takes values between zero and one. A motor unit was considered 'active' if its smoothed firing rate exceeded 5 Hz, and was considered 'independently active' if both it was active and no other motor units had firing rates simultaneously exceeding 5 Hz. This 5 Hz threshold corresponds to the approximate physiological minimum motor unit firing rate [35]. Throughout this analysis, we utilize the CIFT as a general measure of relative independence of motor units and use it across various contexts. Note that our use of CIFT in the exploration period extends its use from comparing motor units to comparing NMF components.

Dimensionality computation
The participation ratio (PR) was computed to quantify the dimensionality of the iEMG and firing rate data [36][37][38]. The PR is a metric computed on the covariance matrix of a feature and represents the approximate dimensionality of the manifold spanned by that feature; a higher PR means more principal components are needed to explain a given proportion of the feature's variance. PR is defined as: where λ C (i) is the ith eigenvalue of the covariance matrix C of the corresponding feature (iEMG or firing rates). PR was computed across the periods spanning the force-control tasks, center-out tasks, and speller tasks. In our data, the PR corresponded to the number of principal components needed to explain approximately 83% of the total feature variance. The relationship between selected and unselected motor units during the force-control and center-out tasks was characterized using linear regression. Linear regression was used to predict the unselected motor units' activity from the activity of the selected ones. The quality of this prediction was characterized by the coefficient of determination (R 2 ).

Recruitment thresholds
Recruitment thresholds for each motor unit were computed for both elbow flexion and wrist supination from force-control task data. First, force data from load sensors was smoothed with a median filter and normalized within each session to values between zero and one. Then, for force-control task trials in which elbow flexion (forearm supination) was the sole movement indicated, the recruitment threshold for a particular motor unit for elbow flexion (forearm supination) was identified as the average across trials of the measured load at the beginning of the 1st occurrence of three consecutive firings with ISI less than 200 ms.

Control spaces
To compare possible joint firing rates observed across the three motor units selected for the center-out, a control space was computed for firing rates spanning the center-out and force-control tasks. First, each unit's firing rate was normalized to 1.5 times the normalization value used for that particular unit in the center-out task, in order to focus analysis on the same, task-relevant firing rates between tasks, firing rates above this normalization value were discarded. Normalized firing rates were then binned into 1 of 10 equally-spaced bins between 0 and 1, with any time points exceeding a normalized value of 1 ignored. Each time point was thus converted into a 3D point representing the firing rate bins of each of the three selected motor units. From this, control spaces for the center-out and force-control tasks were then computed as the set of joint firing rate bins that occurred for at least 160 ms (i.e. ten bins) in a given task. The maximum number of populated bins, i.e. the largest possible control space, was thus 10 3 .
To facilitate 2D visualization of the 3D control space, NMF, as used in the exploration period analysis, was performed for each session on the forcecontrol's firing rates to compute two non-negative components that best factorized the data. Components were ordered such that component #1 captured more variance in the data than component #2. These components were then normalized for the day such that the maximum projected value across either component was 1. Finally, firing rates for the force-control and center-out tasks were then projected onto these two components. After this normalization, the control space was then computed as above, i.e. component projections were binned between 0 and 1 and counted.
Ordinary least-squares was used to determine the best-fit line between mean center-out task performance and the relative fraction of control space bins that were novel in the center-out. For performance analysis focusing only on T1 and T2 targets, the above control space computation was repeated with only the 1st two selected motor units as opposed to all three.

Statistics
Statistical tests, their significance values, and the relevant number of samples are reported in the appropriate figure legends and/or relevant method section. Error bars used in point-plots represent 95% confidence intervals. No data were excluded from the analyses, unless specifically reported.

Results
We devised an NMI capable of providing real-time visual and auditory neurofeedback of biceps brachii motor unit action potentials ( figure 1(A)). This NMI measured neuromuscular signals using a highdensity grid of sEMG electrodes and used previously validated blind source separation and classification techniques to decompose these signals into individual motor unit action potentials in real-time [27,28]. After a brief initialization period for the decomposition model, we first instructed participants to use the NMI's neurofeedback to explore covert strategies to control individual motor units independently from one another. The goal of participants during this exploration procedure was to find and sort in order of controllability the three motor units they felt had the highest potential for independent control ( figure 1(B)). A motor unit selection algorithm highlighted motor units with potential for independent control and guided participants in this task. After this exploration period, participants' ability to control their selected motor units was tested in a center-out task (figures 1(C) and (D); supplementary video 1 (available online at stacks.iop.org/JNE/ 18/066019/mmedia)). A population-coding strategy was used to map motor unit activity into the 2D position of a computer cursor, and participants had to operate this cursor to achieve the displayed targets. Twelve peripheral targets were used to evaluate whether participants could recruit the selected motor units exclusively of one another (T1, T2, and T3 targets) and simultaneously in combinations of two (T4 targets) and could regulate the firing rate of the recruited units (close and far targets, figure 1(C)). T1, T2, and T3 targets were ordered such that T1 corresponded to exclusive recruitment of the subjectively easiest motor unit to activate independently and T3 the subjectively hardest. A center target requiring participants to coactivate all the selected units at a similar intensity (T5 target) was also used. These targets were grouped into three difficulty levels, with targets of increasing difficulties becoming available after reaching an average success rate greater than three targets per minute on the tested targets ( figure 1(D)). We used this paradigm to train eight participants over six consecutive days. Participants' arms were constrained to fixed elbow and wrist angles via a sensorized orthosis for the entirety of each session. Additionally, while we did not explicitly track motor units across days, we used markings on skin to ensure consistent electrode positioning.

Skilled independent control of individual motor units on day 1
We found that participants displayed independent control over selected motor units already at day 1 ( figure 2). In particular, participants successfully completed an average of 95.6% and 79.2% of the presented T1 and T2 targets on day one, demonstrating independent control of motor unit #1 and #2, respectively (figures 2(A)-(C), p < 0.001 when testing for % successful trials > 0). All but one participant surpassed the three targets per minute threshold in success rate required to enable T3 targets, and half of the participants subsequently reached sufficient proficiency to also enable T4 targets (figure 2(C)). Participants encountered no difficulty in performing T5 targets, succeeding in all the corresponding trials. We also found no statistically significant difference in the percentage of correct trials between targets with different distances (p > 0.05 for each target category, figure 2(D)). These results demonstrate that participants, without any prior training, can gain skilled independent control of two or three motor units within a single session, suggesting some level of latent flexibility in the sensorimotor system.

Learning over time
We next evaluated how participants' performance evolved over time. For this, a trial performance metric was first computed, which embedded information regarding the average distance of the cursor from the target, trial duration, and participants' ability to selectively recruit target-specific motor units (figures 3(A) and (B)). A LMM was used to predict trial performance as a function of time, while controlling for possible variations between participants, days, and targets. Participants' performance increased both within (p < 0.001) and across days (p < 0.006), with fixed effects equivalent to an increase in performance of 1.4 standard deviations over 100 trials and of 0.4 standard deviations over the 6 d, respectively (figures 3(C) and (D)). The fixed-effect for the interaction between the within-and the across-day time variables was non-significant (p = 0.094). The model intercept corresponded to an average successful trial rate of roughly 95% (standardized performance of −0.44, figure 3(A)), confirming the previous analyses indicating successful task performances already at day 1.
Target-specific models were then built to better evaluate the effect of training on participants' ability Second row, bipolar surface EMG signals from the three channels that best discriminate the activity of the selected motor units and relative raster plot of the detected motor unit firings. Third row, cursor position (r and θ, solid black traces) and targets (colored boxes) displayed in polar coordinates; grey dotted lines indicate interpolated values when θ is undefined (r = 0). Bottom, arm position and angular velocity about the two axes of largest variation (PC #1 and #2). Grey-shaded areas crossing the different plots indicate ongoing trials and the relative target; empty spaces between these areas indicate rest targets. (B) Median (lines) and 95% confidence interval (shaded areas) of the selected motor unit waveforms measured from the EMG channels in (A). (C) Summary statistics of the 1st training day. Left, box-plots representing the percentage of correct trials for each of the performed targets and participants. * indicate a significant difference from 0, p < 0.0001. Middle, box-plots representing the number of trials performed for each of the performed targets and participants. Right, medians (black lines) and 95% confidence intervals (shaded areas) of the number of participants that successfully performed at least one trial for each target category. (D) Effect of target distance on percentage of correct trials. Colored point plots indicate the medians and 95% confidence intervals of the percentage of correct trials for close and far targets, for each color-coded target category. Light-grey scatter plot and box-plots report the raw data points and their distribution, respectively. No significant difference was found between targets of the same category but different distance (p > 0.5, n for each target category is indicated in (C)).
to control the three selected motor units exclusive of one another (T1, T2, T3 targets). Results showed significant across-day learning for all three targets, but only significant within-day learning for the 1st two motor units, highlighting the importance of multiday training to enable the emergence of skilled control of multiple individual motor units ( figure 3(E)). The interaction between learning within and across days was significant for T1 targets (p = 0.028) but not for T2 and T3 targets (p = 0.2 and p = 0.67, respectively). In addition, model intercepts showed that participants' performance on T1 and T3 targets were respectively higher and lower than average (T1 intercept > 0, T3 intercept < 0, p < 0.05), indicating that participants accurately ranked their motor units in order of controllability following the exploration procedure.
We finally analyzed how participants' performances on the simultaneous targets (T4) evolved over time. Since every participant did not reach these targets every day, only across-day learning was analyzed. Specifically, a generalized LMM was used to evaluate While the trial duration is the same for both (60 s), the holistic metric indicate better performance for trial #2, properly capturing differences in cursor trajectories between these two trials. Similarly, trials #3 and #4 are similar in duration but different in performance. Trial #5 reports an example of a high performance trial. (C) Regression lines of the LMM used to evaluate overall learning within-and across-day (n samples = 5249). Thick black lines represent the regression lines of the within-and across-day fixed-effects, i.e. the effects that are generalized across participants, sessions, and targets; shaded grey areas indicate the 95% confidence intervals. Thinner, colored lines represent the fitted regression lines for each participant and target category. (D) Fixed and random effects for key model parameters. The intercept indicates the performance at day #1. The interaction is between the within-and the across-day time variables. (E) Fixed and random effects for key parameters of the models used to evaluate unit-specific learning behaviors (n samples = 1311, 1230, 1050, for the T1, T2, and T3 models, respectively). (F) Success rate of T4 targets across days fitted using a Poisson generalized LMM (n samples = 48, fixed effect p = 0.013, left) and average success rates across all days for T1, T2, T3, and T4 targets reported for qualitative comparisons (right). Left, the thick line indicates the fixed-effect regression line; thinner lines indicate the regression lines for each participant; dots indicate the raw values. Stars indicate a statistically significant difference from 0: * indicates a p < 0.05, * * indicates a p < 0.01. Right, small grey dots indicate the averages for each participant; colored dots and bars represent the across-participants medians and corresponding 95% confidence intervals, respectively. how the rate of successful trials evolved across days ( figure 3(F)). The fixed effect was significant, indicating an overall increase in the success rate across all participants (p = 0.016).
These analyses demonstrate that by the end of the 6 d of training all participants gained skilled independent control of the selected motor units (figures 3(C)-(E)). The increase in performance across days also shows that learning is robust to changes in recording setups, suggesting strong potential for an NMI that would exploit this strategy to extract volitional control signals.

Exploration and acquisition of independent motor unit control
We then evaluated the role of the exploration period, occurring immediately before the center-out task, in the acquisition of independent motor unit control. Due to the unstructured nature of the exploration period, motor unit firing rates were first decomposed into separate components via non-NMF to identify groups of units that were often mutually active. The number of components was fixed to three, aligning with the instructions given to the participant to ultimately select three representative motor units. Then, the CIFT was computed for each component as the fraction of time a component was independently active relative to the overall time in which it was active (figures 4(A) and (B)). The three components were then ordered in descending order by the CIFT value 2 min into the exploration period, and CIFTs were compared between this initial point and their final values ( figure 4(B)).
The CIFT increased significantly over the course of the exploration period (figures 4(B) and (C)). The overall mean CIFT across the three components increased from 0.40 after the 2nd minute of exploration to 0.51 at the end (p < 0.0001, figure 4(C)). The 1st component (C1) was activated nearly completely independently at the beginning of the exploration period, emphasizing the level of ease in attaining independent control in one set of motor units. However, C1 then began to co-activate more throughout the exploration period as the participant explored strategies for activating other sets of units, as indicated by a decreasing CIFT (p < 10 −5 ; figure 4(C)). On the other hand, the other two components (C2 and C3) increased in independent activation over time (p < 0.05; figure 4(C)), illustrating a progressive learning process.
We next asked whether participants' motor unit control in the exploration period improved across days. There was a significant increase in the mean CIFT for exploration periods across a participant's 6 d of training (p = 0.017; figure 4(D)). Participants thus demonstrated across-day improvements in independent motor unit control in both the center-out task and the exploration period. Finally, the mean CIFT displayed during the exploration period was found to have a strong positive correlation with the center-out task performance of the same day (fixed-effect slope: 1.61; p < 10 −6 ; figure 4(E)). Taken together, these results characterize the within-day and across-day processes by which participants acquired independent control of motor units.

Muscle activity dimensionality
Participants' success in the center-out task required independent motor unit control, indicating that the activity of the selected motor units lay along a multi-dimensional manifold. To evaluate how this differed from isometric motor behaviors typically used to study motor unit recruitment properties, each day participants were tested in a force-control task, using the same experimental setup as in the rest of the session. Here, participants were instructed to match displayed force profiles by performing isometric, ramp-and-hold contractions in the two primary movement directions of the biceps, elbow flexion and forearm supination [39] ( figure 5(A)). Participants accurately reproduced the target forces (mean normalized r > 0.95, figure 5(B)).
To analyze the dimensionality of motor unit activity between tasks, we computed the PR of motor unit firing rates ( figure 5(C)). The PR measures the spread of the variance captured by each of the dimensions of the eigen-spectrum [36,38] and can be envisioned as a continuous interpolation of the number of principal components needed to explain 80%-85% of variance (83% in our study's data).
We found that motor unit firing rates had a higher average PR during the center-out task than during the force-control task (p < 0.0001; figures 5(D) and (E)). PR between tasks significantly increased whether considering solely the three selected motor units or the remaining unselected motor units, signifying an increase in dimensionality across the entire population of motor units (p < 0.0001; figure 5(E)). In addition, selected units' firing rates could predict the concurrent firing rates of the unselected motor units fairly well (mean R 2 > 0.56 for both tasks) through a linear transform, indicating strong correlations between activities of the two groups (p < 10 −10 different than zero; figures 5(F) and (G)). However, for the same population of units, the R 2 metric was lower in the center-out task, indicating a decoupling between selected and unselected units (figure 5(G); p < 10 −10 ). Relatedly, the PR during the center-out task increased over the 6 d of training in the selected motor units but not in the unselected motor units (p < 0.01 in selected, p = 0.68 in unselected; figure 5(H)).
Finally, we compared motor unit firing rates dimensionality with that computed on the iEMG, a commonly used feature for measuring overall muscle activity. PRs computed on the iEMG showed similar across-task differences, yet PR increased more for firing rates than for iEMG (p < 0.01; figures 5(I) and (J)).
Taken together, these results reveal the centerout task enabled both a significant, population-level increase in dimensionality relative to during stereotyped, isometric contractions and an increased decoupling between unselected and selected motor unit populations. Differences in dimensionality measured using firing rates and iEMG also suggest that individual motor unit resolution recordings are necessary to fully appreciate muscle activity dimensionality.

Motor unit recruitment order
Our results suggest that recruitment order of biceps brachii motor units might be more flexible than previously thought and that neurofeedback may enable motor unit recruitment that significantly differs from those observed during typically-studied, isometric, slow-ramping contractions. To evaluate this divergence in motor behaviors, we compared the stability of motor unit recruitment order across tasks.
We first assessed recruitment thresholds of selected and unselected motor units during the forcecontrol task. Flexion and supination recruitment thresholds for all units spanned a wide range, distributed in agreement with the common model of motor unit frequency distribution skewing towards more lower-threshold units within a muscle [40] (figures 6(A) and (B)). 97% of motor units selected for the center-out task were also detectable during isometric muscle contractions; the remaining 3% were not recruited during flexion or supination contractions possibly due to small changes in postures that often occurred between tasks, and were excluded from the following analysis. 12% of selected motor units were recruited exclusively during either flexion or supination contractions, and 33% had categorically different recruitment thresholds between flexion and supination contractions. This varied recruitment order is in support of existing studies reporting biceps motor units can be recruited selectively for flexion or supination [13,41,42] (figure 6(B)).
We then asked if the joint firing rates of selected motor units observed during the center-out task overlapped with those observed during the force-control task. In order to focus on the same firing rate regime across the two tasks, the firing rates of each of the three selected motor units were first normalized to 1.5 times the normalization value used for that unit in the center-out task, discarding firing rates above these values. Then, after binning these normalized firing rates, a control space was defined for each session's force-control and center-out tasks as the set of joint binned firing rates observed for at least 160 ms during each task (figure 6(C)). Since a particular point in control space could be populated with only a nominal amount of time, the control space served as a liberal and firing rates (right). The across-task increase in PR in iEMG was less than that of firing rates (p = 0.003, paired t-test, n = 48). estimate of putatively achievable joint firing rates and, consequently, described possible motor unit recruitment orders achievable in the two tasks. The control space of the force-control task spanned a small fraction of the possible controllable space (8.9%, figures 6(D) and (E)), suggesting the force-control task activity of the three selected motor units were largely stereotyped and low-dimensional, in agreement with the above dimensionality analysis. In contrast to this, the control space of the center-out task occupied significantly more of the possible controllable space (38.1%, p < 10 −5 ; figures 6(D) and (E)). Furthermore, if elbow flexion and forearm supination were effective strategies for individual motor unit control, one might expect firing rates of highperforming center-out sessions to overlap in control spaces between tasks. In opposition to this, centerout task performance increased proportionally to the relative increase in control space achieved during the center-out task, further emphasizing that individual , Visualization of the control spaces populated during the CO task, the FC task, and their difference. For visualization purposes here, NMF was first used to reduce the selected units' firing rate dimensionality from 3 to 2 before computation of the center-out (top-left) and force-control (top-center) control spaces. Component #1 was defined as the component out of the two that explains more variance. For each session, points in the session's center-out control space not seen in that session's force-control control space were termed 'novel' , and then the percent of sessions in which a particular point in control space was novel was computed (top-right). Gray squares indicate control space points in which fewer than 5% of the sessions had that point populated. Bottom: variance of original firing rates explained by the two NMF components during the force-control task (mean 97.5%) was significantly (paired t-test; p < 10 −5 , n = 48) higher than that during the center-out task (mean 77.6%). (E) Proportion of populated control space by firing rates during the (left) and CO (right) tasks. FC control space was significantly lower than CO control space (8.9% FC vs 36.1% CO, paired t-test, p < 10 −5 , n = 48). (F) Relationship between center-out performance and the relative fraction of all control space bins populated across both tasks that were novel during the CO task. Regression lines had a significant positive slope both when comparing all three selected units' targets to a 3D control space (blue, p = 0.012 slope > 0, n = 48) and when comparing only the 1st two units' targets to their respective 2D control space (purple, p = 0.014 slope > 0, n = 48). Individual dots represent a session. Stars indicate statistically significant slopes different than zero (p < 0.05). motor unit control did not solely rely on elbow flexion and forearm supination primitives, even for the higher-performing sessions (p = 0.012, figure 6(F)). This relationship holds even when only considering the easiest two targets (T1/T2) and the control space across the two respective units (p = 0.014, figure 6(F)).
These results demonstrate that, despite known task-dependent recruitment order variation in the biceps brachii, motor unit recruitment during the neurofeedback task significantly differed from that measured during elbow flexion and wrist supination isometric contractions, suggesting independent motor unit control involved an expansion of activity outside of these two existing motor primitives.

Confound analyses
While participants' elbow and wrist joints were constrained by the orthosis, gross movements at the level of the shoulder or the spine could have affected motor unit detection quality. To control for this potential confound, in addition to instructing participants to minimize movements when trying to control motor unit activity, we recorded arm kinematics and analyzed arm movements throughout the center-out task. The rotational axis of the kinematic sensors was first aligned to the axis of largest variation, with 87% of movement occurring around a single rotational axis ( figure 7(A)). Data along this axis were then used to quantify movement amplitudes throughout the center-out task. In particular, the MAV during trial and inter-trial periods was used to measure the overall movement observed across the different task conditions. For each target, we then computed the within-day median and used this statistic to evaluate possible movement strategies. Results highlight minimal movements across all conditions, with a grand median value of approximately 0.48 • s −1 ( figure 7(B)). When comparing the statistics of active targets to the rest targets (i.e. the inter-trial periods), we found a statistically significant increase in median MAV only during T5 targets (p < 0.001, figure 7(B)), highlighting how the nonspecific motor unit recruitment required by these targets pushed participants to perform vigorous muscle contractions to obtain the target as quickly as possible. In contrast, we found a significant movement reduction during both trials for T1/T2 targets and the rest trials (p < 0.001 and p < 0.01, respectively) and no significant differences between trials for the T3 target and rest trials, confirming that participants minimized body movements throughout this task. We then evaluated whether participants relied on specific movements to independently control the selected units by trying to decode the cursor position during the center-out task from arm kinematics using a MLP. Decoding performances were poor (median R 2 = 0.165, figure 7(C)), indicating weak correlations between arm kinematics and cursor position. These combined results provide compelling evidence that the independent control of single motor units observed throughout the center-out task was neither the result of movement artefacts nor relied on specific movements.
Another confound that could have facilitated independent motor unit control is the presence of crosstalk from neighboring muscles in the recorded neuromuscular signals. Aside from the biceps brachii, the brachialis is the next most likely muscle to be recorded by our electrodes due to its proximity; however, while the biceps brachii is known to participate in both flexion and supination, the brachialis participates only in elbow flexion [43,44]. In order to assess recordings for brachialis contamination, we computed the correlation of each channel's iEMG to flexion and supination forces during periods in the isometric contraction task where these task-oriented contractions were tested separately ( figure 7(B)). Correlations for flexion and supination were averaged within the three groups of channels most vulnerable to brachialis contamination: the column located most externally (i.e. nearer to the long head of the biceps), the column located most internally, and the distal row of channels. Mean correlations for all channel groups remained relatively high (>0.7) across both flexion and supination. While spatial differences in correlations are expected even within the biceps brachii, channels primarily recording from the brachialis should display a marked drop in supination correlation during supinating contractions [44]. The high correlations for both flexion and supination suggest brachialis contamination in our recordings was minimal and that the recording grid was primarily placed over the biceps brachii. Taken together, these results suggest that movement artifacts and crosstalk contaminations are unlikely to have significantly affected the validity of our results.

Speller task
We then demonstrated a proof-of-concept of the translational potential of an NMI enabling skilled independent motor unit control. To demonstrate The PR of firing rates during the speller task significantly increased relative to that day's force-control task (paired t-test; p < 0.0001, n = 24) but was not statistically different than that of the center-out task (paired t-test; p > 0.05, n = 24). (F) Mean CIFT metric computed within the three selected motor units for each speller task period increased relative to the mean CIFT during the center-out task (paired t-test; p < 0.0001, n = 24). a clinically relevant application, participants were tested on a commonly used copy-typing speller task requiring point-and-click navigation of a virtual keyboard [7,34,45] (supplementary video 2). This speller task utilized the same selected motor units from the center-out task but, as opposed to the center-out's position decoding, instead translated the normalized motor unit firing rates into the velocity of an on-screen cursor ( figure 8(A)). Navigating this cursor on a virtual OPTI-II keyboard displayed on the computer monitor, participants copied sentences by controlling motor units independently for both cursor movement and cursor clicking [7,25] ( figure 8(A)). The keyboard featured wraparound borders, which in combination with the cursor's velocity control allowed for full 2D navigation even with a single motor unit. We reasoned this more permissive control strategy to be better suited for translational applications compared with the control strategy used in the center-out task. Cursor clicking was triggered by simultaneously recruiting all the selected motor units, similar to achieving the centerout T5 target. Participants performed the speller task after at least 30 min of center-out task execution on the last 3 d of training, plus on any prior days in which they felt confident with their performance and completed a minimum of 60 min of recording.
Information throughput was assessed with the achieved bitrate, a conservative estimate of the true throughput of an assistive device [34]. Average and peak bitrates on the speller task were promising: the mean average bitrate on the last day was 0.43 bits s −1 -corresponding to 5.41 correct characters per minute at a 92.2% accuracy-with a mean peak bitrate of 0.55 bits s −1 ( figure 8(C)). Participants significantly increased their average speller bitrates over days of training, echoing similar acrossday learning as seen in the center-out tasks and exploration periods (p = 0.005 figure 8(D)). Dimensionality as measured by the PR significantly increased (p < 0.0001) during the speller task relative to the isometric contraction task and was not significantly different than that of the center-out task (p > 0.05), indicating participants used a strategy based on multidimensional independent motor unit control ( figure 8(E)). Participants' strategies for moving the cursor leaned more towards recruiting the three individual motor units exclusively of one another than simultaneously in combinations of two (mean CIFT speller > mean CIFT centerout, p < 0.0001), agreeing with the increased difficulty observed during T4 targets in the centerout task requiring simultaneous unit activation ( figure 8(F)). Taken together, these results demonstrate the translational potential of an NMI enabling skilled independent motor unit control.

Discussion
Leveraging a non-invasive neurofeedback paradigm, we probed volitional control of individual motor units within the biceps brachii. Over 6 d of training, participants steadily improved performance in a center-out task requiring both exclusive and simultaneous control of three motor units. We found that the dimensionality of motor unit activity during this task exceeded that measured during stereotyped, isometric muscle contractions and provided compelling evidence that this independent motor unit control involved changes in motor unit recruitment order relative to such contractions. Finally, demonstrating an application of this NMI through a speller task, we showed that participants could use this acquired motor unit control in a task requiring continuous, multi-dimensional control. Here we discuss the significance of these results for motor control theories and translational applications.

Skilled independent control of individual motor units
Volitional control of individual motor units was first reported in pioneering neurofeedback studies in the 1960s and 1970s [19][20][21][22][23]. In these studies, the raw electrical signals measured from intramuscular electrodes were used to provide participants with visual and/or auditory neurofeedback signals on the underlying motor unit activity. Using this neurofeedback system in unstructured tasks similar to our exploration procedure, authors reported that participants were able to selectively activate individual motor units in the abductor pollicis brevis [20], extensor digitorum [21], and the tibialis anterior muscles [19] and able to vary the firing rate of isolated motor units on command [23]. Despite this initial interest, the extent to which individual motor units can be volitionally controlled remained largely unclear, as these initial studies did not report quantitative measures of control quality or independence, assess independence of motor units by testing both exclusive and simultaneous motor unit activation, quantify the ability to modulate firing rates, examine learning capabilities over time, or compare periods of neurofeedback motor unit control with naturalistic, isometric contractions of the same muscle.
Here, we found that individual motor units can be controlled independently from one another and that control proficiency can be improved with training. In particular, we showed that over 6 d of training in a center-out task, participants progressively acquired skilled independent control of three motor units of the biceps brachii. This skilled control was evidenced by participants' ability to control each of the selected units' firing rate both exclusive of (T1, T2, and T3 targets) and in combination with other units (T4 targets) to achieve targets at different distances from the center of the screen. Critically, not shown in previous neurofeedback studies, participants' abilities to both exclusively and simultaneously activate the three selected motor units demonstrate the existence of multiple descending neural drives simultaneously converging onto a single motor pool. Through simultaneous recordings of non-selected motor units, dimensionality analysis also revealed that selected units were not activated in complete isolation but instead activated in conjunction with other units. Finally, through comparisons of center-out task firing rates to those observed throughout the comprehensive combinations of forearm supination and elbow flexion of the force-control task, we demonstrated that neurofeedback enabled motor unit activity patterns expanding beyond recruitment constraints previously thought to characterize neural control of motor units. These results present an unprecedented level of control over individual motor units belonging to the same muscle, bolstering the characterization of selective motor unit activation documented in previous studies and expanding our understanding of the flexibility possible in neuromuscular control.

Mechanisms of independent motor unit control
It is currently accepted that the orderly recruitment of motor units within a muscle applies, not to anatomically defined motoneuron pools as originally suggested by Henneman's size principle [10], but to functionspecific motoneuron populations that can innervate multiple muscles and/or compartments within a single muscle [41][42][43][46][47][48][49][50][51][52]. In particular, the biceps brachii, used in this study, is known to have multiple anatomical neuromuscular compartments, with separate subdivisions even within the gross anatomical divide of the short and long head [53]. Biceps brachii motor unit recruitment has been shown to vary depending on the contraction levels in the flexion and/or supination directions, with motor units distributed across the biceps with no clear spatial distribution relevant to function [13,41,42,54]. Other muscles have been shown to have similar taskdependent recruitment order differences, such as in the 1st dorsal interosseus muscle when performing flexion versus abduction of the index finger and in a variety of non-multifunctional arm muscles [43,55]. In agreement with this existing body of literature, we found that biceps motor unit recruitment significantly differed between elbow-flexion and forearmsupination isometric contractions.
Participants' ability to control the selected motor units exclusive of and in combination with one another in the center-out task suggests that a minimum of three partially independent neural drives must converge to the biceps motoneuron pool. If the prevailing task-specific orderly recruitment model is correct, these neural drives should be associated with established motor primitives, such as elbow flexion and forearm supination for the biceps [42]. Therefore, learning to exclusively recruit individual motor units should be equivalent to finding the appropriate motor tasks to perform, and any increases in center-out task performance should be attributed more to learning the association between these particular tasks and the computer cursor than to improvements in performance of these well-established primitives. Radhakrishnan et al [56] studied this type of learning in a center-out task similar to that of our study, in which participants learned to control a computer cursor through various arbitrary, nonintuitive combinations of upper-limb motor tasks, e.g. through simultaneous elbow flexion and index finger abduction. These participants learned the task and achieved a high-level performance plateau within 30 min. In stark contrast to Radhakrishnan et al study, participants' performance in this study increased throughout several days of training with no significant decrease in overall learning rate over time, suggesting a different mechanism for independent motor unit control than recalling established motor primitives. Moreover, dimensionality of the selected units' firing rates in the center-out task increased throughout the 6 d of experiments, similarly signifying an activation strategy requiring refinement over time and thus not clearly related to executing stereotyped tasks. Observed combinations of firing rates across the selected units significantly expanded during the center-out task relative to the force-control task, suggesting that individual motor unit control did not solely rely on the execution of elbow flexion and forearm supination motor primitives. Aligned with this observation, participants sometimes reported relying on abstract biceps contraction strategies that they were not able to precisely describe (supplementary discussion 1). Taken together, these observations suggest that the boundaries between neuromuscular compartments may not be as strictly defined by established motor primitives as previously thought.
Our results suggest the presence of some latent flexibility in motor unit recruitment order that allows for the formation of novel motor patterns. In particular, recruitment analyses showed that neurofeedback enabled participants to expand the available control space of the selected motor units beyond the robust low-dimensional manifold spanned during combinations of forearm-supination and elbow-flexion isometric contractions. Such flexibility could rely on selective pathways that bias motor unit recruitment in neuromuscular compartments otherwise controlled by a single descending neural drive. Selective recruitment mechanisms have been previously hypothesized to account for de-ordered motoneuron recruitment under certain conditions, as for example during ballistic [57] or lengthening [17] muscle contractions or following cutaneous stimulation [58]. This selective motor unit activation has been hypothesized to arise from heterogeneously distributed excitatory input to the spinal motoneuron pool and/or through excitatory or inhibitory synaptic currents that bias pools of motor units [59]. While there has been a lack of empirical evidence suggesting that these pathways are involved during established motor behavior [60], we suggest that such mechanisms, enabled by neurofeedback, might underlie this study's observed flexibility in motor unit recruitment order. However, the presence of selective recruitment mechanisms should not be interpreted as a lack of orderly recruitment. On the contrary, the population-level increases in firing rate dimensionality during the center-out task emphasize the existence of constraints between motor units that could restrict which motor units are able to be selectively recruited, as the ability to selectively recruit every individual motor unit in the nervous system would be computationally infeasible [11]. We, therefore, propose that both these mechanisms can influence neurofeedback-enabled motor unit recruitment and that the orderly recruitment of subgroups of motor units observed during isometric contractions may not be an immutable constraint of unit activation but rather be an emergent property of motor control. These observations are particularly aligned with the parallel work of Marshall et al on flexible control of individual motor units [18], which provides compelling evidence of de-ordered motor unit activity during complex motor behaviors and suggests that descending neural drives can contribute to flexible motor unit recruitments to adapt muscle activity to task demands.
While orderly recruitment of motor units maximizes the computational efficiency of the central nervous system during the production of a stereotyped output [11], additional flexibility in motor unit recruitment can enable the neuromuscular system to cope with the wide range of movement conditions needed for everyday life [59]. Our study sheds additional light on the ongoing debate on the generalization of orderly recruitment principles and the ultimate flexibility of the sensorimotor system [60].

Potential translational implications
Our ability to skillfully control different muscles is tied to our ability to sense their state through proprioceptive organs [61]. Aligned with a motor control framework maximizing robustness and computational efficiency, these organs only provide information at the level of each muscle and not at the level of single motor units, making it impossible to naturally learn to skillfully control different motor units independently. By artificially augmenting the resolution of proprioceptive signals using neurofeedback, here we demonstrated that skilled and independent motor unit control is possible. In particular, the developed NMI enabled participants to expand the control space of individual motor units within a single muscle beyond that observed during isometric contractions typically studied in the biceps brachii. This increased number of independent DoFs per muscle could enable some applications typically reserved for BMIs. In particular, in people with cervical spinal cord injury, where in the most severe cases only the muscles innervated by cranial nerves remain functional, existing NMIs provide limited utility compared to BMIs: most muscles with residual control perform critical motor functions and would not be useful for a traditional myoelectric application in which only natural motor primitives are mapped to the controlled end-effectors. However, by instead expanding the number of controllable DoFs within a single muscle, the proposed NMI could more effectively repurpose residual muscles with dispensable functions, such as the auricular muscles [62], or even the functionally-paralyzed muscles themselves if sufficient volitional motor unit control remains [63].
The proposed NMI could also be advantageous for applications in able-bodied users. Indeed, by increasing the number of control signals that can be extracted per muscle, fewer muscles are required to operate the same number of actions, thereby potentially minimizing interference with functions needed for daily life. However, it remains to be assessed whether independent control of individual motor units can be achieved while multitasking and whether the added control dimensionality is sufficient to compensate for the increased cognitive demand and training required to operate such a NMI.
In addition to expanding the control dimensionality of individual motor units, the developed NMI enabled precise control of selected individual motor units both exclusive and in combination with other units. As illustrated in the speller task, this ability enables motor units to produce continuous, multidimensional control from a single muscle. Within the BMI domain, this form of control has been required to skillfully and intuitively operate complex devices and has enabled not only typing but also more general BMI applications, such as pointand-click navigation of a computer [64] and control of multi-DoF robotic effectors [1,2]. Since most non-invasive BMI implementations rely on discrete, task-specific control schemas due to the difficulty in non-invasively decoding continuous control signals [65], such control paradigms have historically been confined to invasive BMIs, enabling invasive BMIs to far outstrip performance over their non-invasive counterparts [2,7,45,66,67]. However, recent surveys indicated 40% of surveyed patients with tetraplegia or paraplegia would not undergo implantation even if the implant restored daily function [6,68], indicating the dire need for a class of generallyapplicable non-invasive devices that remains comparable to their invasive alternatives. Furthermore, while best-in-class EEG speller implementations have demonstrated much higher bitrates (3+ bits s −1 [69,70]) than the preliminary demonstration here (0.43 bits s −1 ), their reliance on exogenous stimulation to detect visual attention requires sustained concentration by the user and makes their performance unlikely to translate in self-paced, real-world applications. Indeed, when similar implementations are tested in ecological settings much lower performances are reported [71]. In particular, the bitrate achieved in this study over a few training sessions is comparable to the 0.31 bits s −1 throughput measured over 2.5 years of at-home, all-day use of an EEG speller [71]. Promisingly, our learning analyses indicating that motor unit control improves with more training suggests that the developed NMI might not only be more flexible than existing non-invasive BMI implementations but also may approach the throughput of these best-in-class implementations over time. An NMI enabling independent motor unit control, as the one demonstrated in this study, thus possesses promising qualities that could lead to a new class of highperformance, non-invasive translational devices.

Limitations and future directions
Stable, online detection of motor unit activity using non-invasive recording technologies remains challenging in ecological settings. The waveform of recorded motor unit action potentials and consequently their detection in surface EMG recordings are known to be sensitive to movement artifacts and to relative positioning of skin to muscle, which can be especially deleterious in anisometric conditions [72]. In our study, we overcame these limitations through physical constraints imposed by the orthosis and by instructing participants to avoid performing overt movements when trying to control the selected motor units. We confirmed these relative static recording conditions through kinematics recordings. In more dynamic settings, improved algorithms for motor unit detection may be required to increase reliability. Notably, while global EMG features are often used as a proxy for motor unit activity in non-invasive recordings, their lower information content is likely to hinder performance [29,73], as also suggested by our results showing dimensionality increases that are greater in motor unit firing rates than in iEMG. Alternatively, minimally-invasive intramuscular electrodes could enable individual motor unit recordings during anisometric contractions [74]. These advances in motor unit detection both could afford for a more practical understanding of the inherent dimensionality of a given muscle and are likely necessary for real-world use of a translational device controlled by individual motor units.
Our results suggest that neurofeedback enables an increase in the control space of individual motor units within the biceps brachii beyond that observed during isometric contractions, providing new insights on the flexibility of motor unit recruitment orders. However, it remains unclear whether the observed increase in control dimensionality represents an absolute increase in motor output dimensionality. In particular, while we were able to exclude isometric elbow flexion and forearm supination contractions as an explicit strategy to generate independent motor unit control, it is possible that participants' strategies involved subtle co-contractions of synergistic muscles to modulate recruitment, and thus the observed increase in biceps motor unit activity dimensionality may have come at the expense of the ability to control other muscles. Further studies capable of testing more ecological movement scenarios while simultaneously recording the activity of synergistic muscles are necessary to further explore the potential of the proposed NMI for motor augmentation.
The population-level dynamics across motor units observed in neurofeedback tasks suggest each dimension of the system can be driven by sets of motor units, as opposed to a single motor unit. This can increase robustness to experimental instabilities and can facilitate a finer-grained measurement of a dimension's amplitude by incorporating multiple units' firing rates. Similarly, the decoder can periodically be tuned to optimize for performance or for similarity to previously learned decoders, leveraging the fact that neural activity resides in a persistent, lowdimensional manifold [75].
Additionally, this current study did not explicitly identify nor target selection of motor units that had been selected in previous days of training. The presence of within-day learning in our study and the intracortical BMI literature [76,77] both suggest that retaining similar sets of motor units over days may increase overall performance. This can be addressed by longitudinally tracking individual motor units over training and prioritizing selection of those units [78]. Alternatively, chronically implanted intramuscular electrodes could enable recordings that stably identify motor units across days, though such a system has yet to be shown.
Finally, this study solely tested participants with no history of motor impairments, and so future studies motivated by clinical translation should be performed to determine the efficacy of a motor-unit NMI in people with sensorimotor disabilities. Promisingly, however, recent studies in people with cervical spinal cord injury demonstrated that residual activity in functionally-paralyzed muscles [63] and impaired movements [79] can be successfully harnessed for powering peripheral human-machine interfaces. Additionally, initial neurofeedback studies reported selective motor unit control in a variety of muscles, suggesting this study's results can generalize outside of the biceps brachii [23].

Conclusion
In conclusion, we have demonstrated that skilled independent control of individual motor units belonging to the same muscle can be enabled through an NMI. We uncovered properties of individual motor unit control useful for translational and augmentative applications. Concurrently, we shed light on long-standing questions surrounding the applicability of recruitment order often measured in stereotyped movements to volitional control of individual motor units. Advances in both motor control theory and neurotechnologies are critical to push the field towards more widely-applicable devices. Our study provides advances in both, potentially leading to improved therapeutics for people with sensorimotor disabilities and to a new class of neurotechnologies for human augmentation.

Code availability
Software routines used for motor unit decomposition are published in a public code repository, https://github.com/carmenalab/emgdecomp. Software routines developed for data analyses are available from the corresponding authors upon reasonable request.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.