System based on subject-specific bands to recognize pedaling motor imagery: towards a BCI for lower-limb rehabilitation

Objective. The aim of this study is to propose a recognition system of pedaling motor imagery for lower-limb rehabilitation, which uses unsupervised methods to improve the feature extraction, and consequently the class discrimination of EEG patterns. Approach. After applying a spectrogram based on short-time Fourier transform (SSTFT), both sparseness constraints and total power are used on the time-frequency representation to automatically locate the subject-specific bands that pack the highest power during pedaling motor imagery. The output frequency bands are employed in the recognition system to automatically adjust the cut-off frequency of a low-pass filter (Butterworth, 2nd order). Riemannian geometry is also used to extract spatial features, which are further analyzed through a fast version of neighborhood component analysis to increase the class separability. Main results. For ten healthy subjects, our recognition system based on subject-specific bands achieved mean accuracy of and mean Kappa of . Significance. Our approach can be used to obtain a low-cost robotic rehabilitation system based on motorized pedal, as pedaling exercises have shown great potential for improving the muscular performance of post-stroke survivors.


Introduction
Hemiparesis is a widespread condition, mainly caused by hemorrhagic or ischemic stroke, which consists of a partial loss of motor function on one side of the body contralateral to the brain hemisphere where the lesion occurs [1]. Stroke is a serious neurological disease that continues to increase among younger adults [2]. Post-stroke patients may present a partial or severe neural injury, affecting their motor functions on upper and lower limbs [2,3]. As stroke is associated with gait disturbance [3], several robotic devices, such as robotic exoskeletons and smart walkers have been proposed to assist these patients during the therapy process while walking [4][5][6][7]. Regarding exoskeletons, there are several commercial geometry, neighborhood component analysis, sparseness constraints, time-frequency analysis (Some figures may appear in colour only in the online journal) Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. systems for gait rehabilitation, such as the hybrid assistive limb (HAL) [7][8][9][10], ReWalk [11], Lokomat [12,13], and Lopes [14]. In contrast to physical therapists, these robotic devices are able to perform a high number of repetitive and heavy tasks required for rehabilitation, as they are able to generate mechanical power [7,15,16]. Thus, through these systems, inaccurate movements due to therapist fatigue may be avoided, reducing undesirable accidents or slow recovery. Moreover, robotic exoskeletons may be adjusted to accommodate patient progress during motor recovery, and sensory feedback may be added to the robotic system to provide cognitive and physical interaction between user and exoskeleton [6].
Although robotic exoskeletons have shown effectiveness for gait rehabilitation [7,16], some issues such as weight, complexity, and high manufacturing cost still limit their extension to low-and middle-income developing countries. One way to decrease the cost of these rehabilitation systems is to use an active orthosis together with a walker (conventional [17] or smart [5,18]). This combination can be used to enhance the balance of stroke patients during walking rehabilitation.
On the other hand, pedaling rehabilitation equipment is relatively inexpensive and has shown effectiveness in both gait and lower-limb rehabilitation [19][20][21][22][23][24]. As such, it shows great potential to be more easily extended into developing countries.
As a result, a variety of research based on pedaling exercises has been carried out for post-stroke patients [1,[25][26][27][28][29][30], and some other studies on electroencephalography (EEG) signals have been conducted during pedaling [19,31,32]. In [19], a study was carried out to determine the effect of reciprocal pedaling exercise on the gait and cortical reorganization in post-stroke patients, finding neuroplasticity evidence and improvements in the function of ambulation.
Post-stroke patients with severe motor disabilities may need assistance to voluntarily initiate a movement of an equipment for their rehabilitation. For this reason, some brain-computer interfaces (BCIs) have been proposed to recognize pedaling motor imagery [33,34]. BCIs are communication systems that measure central nervous system (CNS) activity and convert it into artificial output that replaces, restores, enhances, supplements, or improves natural CNS output and thereby changes the ongoing interactions between the CNS and its external or internal environments [35,36]. As a result, BCIs based on EEG have been proposed to control end-applications [36], such as cursors [37,38], televisions [39], wheelchairs [40,41], robotic prostheses [42] and exoskeletons [43,44], and also motorized pedals [33,34]. Event-related desynchronization/synchronization (ERD/ ERS) and motor related cortical potentials (MRCPs) have been used to recognize motor intention on EEG, providing a closed-loop control between both user and robotic device [33,43,45]. However, robotic systems using both motorized pedals and BCIs based on pedaling motor imagery have not been fully explored [33,34].
The aim of this study is to propose a pattern recognition system of pedaling motor imagery for lower-limb rehabilitation purposes, which uses unsupervised methods to improve the feature extraction, and consequently the class discrimination of EEG patterns. Two alternatives for EEG pre-processing were explored: a) a spectrogram applying short-time Fourier transform and sparseness [46] is used to choose the subject-specific band per EEG channel during the calibration stage, which is subsequently used for filtering; b) filtering using fixed frequency bands. After pre-processing, Riemannian geometry is used for spatial feature extraction [47], which is combined with fast neighborhood comp onent analysis for feature selection [48], in order to improve the class discrimination. Our approach may reduce the computational cost of the BCI in comparison with other studies [33,34], as the Riemannian method provides a projection matrix that can be applied on-line to obtain the feature vector for recognition [47].
This paper is structured into four sections. Section 2 firstly describes the experimental protocol, followed by the proposed system to recognize pedaling motor imagery, and the methodology used to evaluate our approach. Afterwards, the results are presented in section 3, in which the performance of our recognition system is analyzed. Finally, discussion, limitations, and conclusions about the proposed method are described in Sections 4-6, respectively.

Protocol
An experimental protocol was conducted on ten voluntary subjects (three females and seven males from 21 to 36 years old) to evaluate the proposed system during pedaling motor imagery recognition. For this study, previously approved by the Ethics Committee of UFES/Brazil (Protocol number: 64797816.7.0000.5542), participants without lower-limb injury or locomotion deficits were selected. The experiment carried the following four stages: (1) fulfill the ethics form; (2) protocol familiarization; (3) electrode preparation; and (4) protocol execution.
First, the subjects were seated on a comfortable chair looking at a 19 inch display, as shown in figure 1(b). All subjects (S01 to S10) provided their informed consent prior to the data collection. Second, the background of this study was explained. In this stage, the subjects were asked to mentally pedal on a monocycle, thinking of a real situation, such as going on the beach or home, among others. Once the subjects had located the best way to execute the pedaling motor imagery, they were asked to mentally pedal on the motorized pedal. Figure 1(c) shows the motorized pedal used in this study. Then, in order for the subjects to train for the real situation of our protocol avoiding mental fatigue, they were asked to execute the pedaling motor imagery, tracking four cues on the display with period of 5 s. Figure 1(d) shows the cue sequences used to guide the subjects throughout the protocol. A graphical user interface was developed to present the cues on the display, using an Arduino board to generate the periods of duration of each cue through a synchronous signal. During the protocol, the subjects were asked to avoid any movements, although this is physiologically unavoidable. Thus, the subjects were asked to execute unavoidable movements, such as blinking during the black screen (see figure 1(d)) preferably. Simultaneously to the protocol familiarization, the electrode preparation was carried out.
Finally, the protocol was conducted in two-stages, in order to obtain an EEG dataset for calibration and validation, which are formed by six sessions, each composed of twelve repetitions. Figure 1(d) shows repetitions of 20 s in length, which are described by the following sequence: (1) black screen, (2) red (Resting), (3) yellow (Waiting), and (4) green (Execution). The cues red, yellow and green indicate the 'rest state', 'ready' (also rest state but waiting for the next cue without mental activity), and 'execution of pedaling motor imagery for 5 s', respectively. Figure 2 shows the block diagram of the proposed system based on subject-specific bands to recognize pedaling motor imagery, which is formed by the following two stages: Calibration and Validation.

Recognition system
The Calibration stage is used for the training set to locate the subject-specific band of each EEG channel, such as done in [49][50][51], analyzing both the power and sparsity on the timefrequency representation [46,52], as shown in figure 2 (first block). These subject-specific bands are used in a second stage for filtering the raw EEG. Then, the best features are selected after applying the Riemannian geometry on the filtered EEG for feature extraction, which are further used to obtain a model for classification through the linear discriminant analysis, as shown in the second block of figure 2.
Finally, the Validation stage is used to recognize new inputs on-line as a real-time closed-loop system. More details on the proposed system are given below. Then, EEG signals were acquired in a frequency range from 0.1 to 100 Hz over the primary and supplementary motor cortex [49], using the following nineteen locations: Fz, FC3, FC1, FCz, FC2, FC4, C5, C3, C1, Cz, C2, C4, C6, CP3, CP1, CPz, CP2, CP4, and Pz. Figure 1(a) highlights with red circles the nineteen EEG electrodes, which were selected in this study from a cap with 64 electrodes (Ag-AgCl), according to the international 10-20 system. Moreover, the reference electrodes were located on earlobes A1 and A2, and the ground electrode was located between the eyebrows. These electrode locations have been previously used in other research to study motor intention of lower-limbs [34,43,45,[53][54][55], as the brain activity linked to real or imagery motor actions are mainly generated on the supplementary, primary motor, primary somatosensory (S1) and pre-motor areas [34,49]. During the electrode preparation, gel was used to improve the skin impedance.
Additionally, two channels of sEMG were placed on both lateral gastrocnemius muscles, and captured in a frequency range from 10 to 100 Hz. These sEMG signals were used here as a reference to ensure no muscular contractions, during the annotation of time periods related to pedaling motor imagery. Another channel was used to capture the synchronous signal generated from the Arduino board, which was used as a reference on the raw EEG to locate the beginning of each cue presented in figure 1(d).

Data selection.
Data selection to form the dataset of motor imagery tasks is a challenge, as reference sensors such as accelerometers and goniometers are not able to accurately mark on the raw EEG when the task occurs. Here, the raw EEG collected throughout the black screen cue was excluded from analysis, as all subjects were asked to execute unavoidable movements such as blinking during this cue. Therefore, undesirable artifacts may affect the signal processing over this time interval. Similarly, the data collected during the yellow ('Waiting') cue was also discarded for analysis, in order to avoid effects related to motor planning (or preparation). In fact, the subjects can unconsciously re-train the motor execution before some cues, which is not trivial to control.
Then, the following two classes 'Resting' and 'Pedaling' were considered to form two balanced sets for training and validation. This way, epochs that correspond to both 'Resting' and 'Pedaling' were selected on the raw EEG for a period of 4.5 s throughout the cue. The first 500 ms after the Pedaling cue was not analyzed here, avoiding any influence of evoked potentials [56,57]. It is worth noting that data collected during the first six sessions were used to obtain the training set, while data collected on the last six sessions were used for validation.

Pre-processing.
Time-frequency representation. The spectrogram based on short-time Fourier transform (SSTFT) has been widely used in BCIs for motor imagery recognition [52,58], as this method can be used to represent non-stationary signals into the time-frequency domain, obtaining low-computational cost, and an acceptable time and frequency resolution below 20 Hz [58]. Therefore, it is useful to obtain features from both µ (8-12 Hz) and β (18-24 Hz) rhythms, which play an important role during motor imagery recognition [33,34,51,52,58].
In our approach, SSTFT was used on the training set to locate the subject-specific band per channel, as shown in figure 2 (and explaining in subsection Selection of subject-specific bands). SSTFT performance may be affected in accordance to the following arrangement: window function, length and overlapping of the window function, and the number of fast Fourier transform (NFFT) [59]. For this reason, several window functions, such as Gaussian, Hamming, Hanning, Barthann, Bartlett, Taylor, Blackman, Bohman, rectangular, and triangular were evaluated here on SSTFT, applying different window sizes and NFFT, with 50% overlapped windows. The size of the window function and the NFFT value can be computed as follows: where T is the period of time of each epoch 'Pedaling' or 'Resting', L is the length of the window function, N p is the number of small segments obtained from each epoch, and NFFT is the number of fast Fourier transform (FFT).
Here, the SSTFT can be computed on the EEG signal as follows: where x is the EEG signal, w is the windowing function of length L, x(n + m)w(m) is a short-time segment of the EEG (1) Stage to obtain the subject-specific band for each EEG channel; (2) stage for filtering, and to obtain the best spatial features, and the classification model. signal x at time n, k is a frequency index, N is the FFT length, and |X(n, k)| 2 is the SSTFT. Large L value gives better frequency resolution, while small L gives less temporal averaging. Usually, N L and larger N provides more frequency domain samples of the discrete time Fourier transform, achieving better location and amplitude of peaks.
The following subsection explains the application of SSTFT on the raw EEG to automatically locate subject-specific bands that present a high contribution during pedaling motor imagery.
Automatic selection of subject-specific bands. The concept of 'sparse coding' refers to a representational scheme where only a few units (out of a large population) are effectively used to represent typical data vectors [46]. In fact, this implies most units taking values close to zero while only few take significantly non-zero values [46]. As a novelty, both sparseness constraint and total power are proposed here to automatically locate on SSTFT the subject-specific bands per EEG channel, which present the highest total power for several trials of pedaling motor imagery. Below are given details of our approach.
Let T = {e 1 , e 2 , · · · , e i , · · · , e N } a training set formed by N trials (or epochs) of the classes resting (c 1 ) and pedaling (c 2 ), where e i ∈ R E×n is the raw EEG, E is the number of EEG channels, and n is the total of samples.
First, the time-frequency representation for each EEG t,f ∈ R T×F (time and frequency indices run over t = 1, 2, · · · , T and f = 1, 2, · · · , F , respectively) is computed on e i through SSTFT (see equation (3)), in order to construct a matrix P ∈ R NT×FE that represents a time-frequency profile over trials [52], as shown in figure 2. Here, P i ∈ R T×FE was constructed over the frequency range from 0 to 40 Hz, which agrees with previous research that employed a frequency range from 4 to 40 Hz [60,61]. In literature, it is reported that the main power spectrum linked to motor imagery tasks may be located on δ (0.1-1 Hz) [33,34], µ (8-12 Hz), and β (18-30 Hz) bands [49,52]. Moreover, a high contribution on the band δ during pedaling motor imagery may be obtained, as described in [33,34].
Second, each column of P is analyzed using sparseness constraints, which are defined by equation (4). Notice that the sparseness is computed for the j th element (frequency component) for each trial i. Also note that the data contributed by the first trial (i = 1) corresponds to the first T row vectors of P.
where ξ is a vector that contains the sparseness constraints obtained for all frequency components j , P is the time-frequency profile constructed from all trials, N is the total of trials, T is the duration of each trial i, and t is the sample of each trial. Then, the mean sparseness is calculated using equation (5).
where ξ is the mean sparseness. The total power and mean power of P can be computed as follows: where φ(P j ) is the the total power, and φ is the mean power. Third, both sparseness constraints and total power are used to select for all EEG channels, those candidate frequencies that pack the highest power. In [46], those candidate vectors that present power value greater than the average power, and sparseness greater than 70% of the mean sparseness were selected. Here, the candidate vectors (j th elements) of P are selected if the data vector j has a power φ(P j ) greater than φ × λ 1, as well as a sparseness ξ(P j ) greater than ξ × λ 2, are empirically selected parameters for regularization. It is worth noting the pair (λ 1 = 1, λ 2 = 0.7) agrees with [46], however, our approach is adaptive to the intrinsic characteristic of each time-frequency profile P.
Then, for each EEG channel and m pair (λ 1 , λ 2 ) the frequency ranges (formed by those selected candidate vectors that present j th elements arranged sequentially) are obtained, holding the frequency range that packs the maximum total power during pedaling motor imagery, as shown in equation (8).
where S E,m is the total power achieved taking into account the EEG channel E and the pair m (λ 1 , λ 2 ), f L and f H are the retained lower and upper frequencies of each bandwidth that present the maximum total power, and h is an index that corresponds to the selected candidate vectors, respectively.
A variety of research has demonstrated that BCIs based on subject-specific bands may improve the motor intention recognition [50,51]. Thus, the location of the smallest frequency bands with significant power contribution may improve the BCI performance [50,51]. For this reason, the Friedman test with α = 0.05 is used here to compare the columns of S, in order to choose the column that presents the lowest value of power band without significant difference with respect to the group of higher power bands. Thus, for all EEG channels their subject-specific bands that result when applying equation (8) for this selected column are finally obtained. Filtering. A low-pass filter of zero-phase shift forward/ backward (Butterworth, 2nd order) is employed for the EEG pre-processing, fixing its cut-off frequency according to the subject-specific bands obtained from the training set.

Feature extraction by Riemannian geometry. The
Riemannian method [47] is used in both Calibration and Validation stages of our approach to compute the spatial features, after filtering the raw EEG, as shown in figure 2. Let e i ∈ R E×n be a trial i of raw EEG, where E is the number of EEG channels, and n is the total of samples.
Then, the training set T = {e 1 , e 2 , · · · , e i , · · · , e N }, formed by N trials of the classes resting (c 1 ) and pedaling (c 2 ), is firstly used in the Calibration stage after filtering to compute a reference symmetric and positive definite (SPD) matrix, denoted as C ref . C ref is a free parameter that defines the point in the manifold M where the tangent plane is computed [47]. Here, C ref is the average of the whole set of labeled covariance matrices, which can be computed as follows: where X p ∈ R E×T is a filtered trial that corresponds to the class y p ∈ {c 1 , c 2 }, p is the number of labels, E is the number of EEG channels, C p ∈ R E×E is the sample covariance matrix (SCM), and n is the epoch duration in number of samples. All covariance matrices {C p } P p=1 can be projected onto the tangent plane by equation (11), while the inverse operation that projects an element of the tangent space back to the manifold, namely the exponential map, is defined by equation (12) [47].
where S p is the tangent vector, logm(·) denotes the logarithm of a matrix [62], and expm(·) denotes the exponential of a matrix. Moreover, the Riemannian distance between both {C p } P p=1 and C ref , and the geometric mean can be computed by equations (13) and (14), respectively. where The geometric mean is computed by an iterative procedure consisting of projecting the covariance matrices in the tangent space, estimating the arithmetic mean in the tangent space and projecting the arithmetic mean back in the manifold [47]. Then iterate the three above steps until convergence. The full algorithm is given by the algorithm 1 [47]: Finally, the projection matrix C ref obtained is used in both Calibration and Validation stages, to project the covariance matrices {C p } P p=1 in the tangent space. These covariance matrices projected onto the tangent plane can be used as a feature in a classifier, considering the following modified halfvectorization operator that stacks, with appropriate weighting, the upper triangular part of S p into a (E + 1)E/2 × 1 column vector, as shown in equation (15).
Below, the Riemannian pseudo-algorithm is presented, which uses the functions covariances, meancovariances, and Tangentspace, available at the following website https:// github.com/alexandrebarachant). E is the number of channels, n is the number of EEG samples per channel, i is the number of each trial, and T denotes the transpose operator. The feature vectors are analyzed on the training set to select those features that minimize the classification error. This way, the fast neighborhood component analysis (FNCA) method was used to learn the feature weights through a regularization process [48,63]. y 1 ), (x 2 , y 2 ), . . . , (x i , y i ), . . . , (x N , y N )} be the training set, where N is the total of samples, and x i is a d-dimensional feature vector with class label y i ∈ {1, 2, . . . , C}. The Mahalanobis distance d between the points x i and x j is given by the following equation [63]: where W is the transformation matrix. If W is a diagonal matrix, the equation (16) can be presented as follows [48]: where w l is a weight associated with the lth feature. Specifically, each point x i selects another point x j as neighbor with probability p ij . Then, a differentiable cost function is used, which is based on the stochastic ('soft') neighbor assignment in the transformed space, as shown in equation (21).
where p ij is the probability of x i selecting x j as its nearest neighbor, p i is the probability that the point x i will be correctly recognized, y ij = 1 for y i = y j and y ij = 0 otherwise, λ is a regularization parameter that can be fitted using crossvalidation, and σ is the width of the probability distribution. Let κ(z) = exp(z/σ) be a kernel function, with width σ. If σ → 0, only the nearest neighbor of the query sample can be selected as its reference point, while σ → ∞ when all points have the same chance of being selected apart from the query point. More details of FNCA can be consulted in [48].
2.2.6. Classification. Linear discriminant analysis was used to discriminate both resting and pedaling classes. This classifier has shown promising results in previous research using BCIs based on EEG to recognize real or imagery motor tasks [64].

Statistical evaluation
The aim of this section is to evaluate the proposed pattern recognition system, using different combinations for pre-processing the raw EEG. This way, both training and validation sets were used. Firstly, several processing windows (0.5 s, 1.0 s, 1.5 s, 2.0 s, 3.0 s, and 4.0 s) and sliding time (20 ms, 30 ms, 50 ms, 100 ms, 200 ms, and 250 ms) were evaluated throughout segments of 4.5 s that correspond to both 'Resting' and 'Pedaling', applying a low-pass filter (Butterworth, 2nd order, and cutoff frequency of 30 Hz) into the pre-processing stage of the recognition system, in order to obtain a suitable widow size and overlapping for EEG processing during pedaling motor imagery recognition. The best window processing and overlapping achieved were used for the next evaluation.
The effect of the sensory stimulation on the system performance was also studied, using a low-pass filter at 30 Hz over epochs related to both Pedaling (from 0.5 to 5.0 s after the green cue beginning) and Resting (considering segments of 4.5 s with sliding 0.1 s from the red cue beginning).
Moreover, both training and validation sets were used to compute the power changes over Cz location during pedaling motor imagery, applying the same low-pass filter at 30 Hz. The power changes were expressed as the relative power change in dB with respect to the baseline rest condition. The relative power was assessed on each filtered epoch, calculating the FFT. Subsequently, spectra were averaged over segments.
Also, the best window processing and overlapping were used later on both training and validation sets to compare the performance of the proposed recognition system using three different setups for pre-processing: (1) a filter with a frequency range from 8 to 30 Hz (typically used in other studies [47,65,66]), (2) a filter with a frequency range from DC to 30 Hz, and (3) a filter based on subject-specific bands obtained from SSTFT. Butterworth filters of 2nd order were used here, and different setups (window function, length of the window function (L), overlapping 50%, and number of FFT (NFFT)) for SSTFT to obtain the subject-specific bands were also evaluated.
The metrics accuracy (ACC), Kappa, true positive rate (TPR), and false positive rate (FPR) [67] were used to evaluate the performance of the proposed system. Both TPR and FPR were calculated for the pedaling class using equations (22) and (23).
where TPR is the true positive rate, FPR is the false positive rate, TP is the total number of the 'Pedaling' class classified correctly, FN is the total number of the 'Pedaling' class classified incorrectly as 'Resting', TN is the total number of the 'Resting' class classified correctly, and FP is the total number of the 'Resting' class classified incorrectly as 'Pedaling'. In addition, both non-parametric methods Friedman test and Wilcoxon signed rank test were used to obtain a statistical comparison ( p < 0.05) between the analyzed recognition systems.

Results
The results show the performance of our recognition system using one of the following three combinations into the preprocessing stage: (1) SSTFT to firstly select subject-specific bands for filtering, (2) Butterworth filter (2nd order) with a frequency range from 8 to 30 Hz (typically used in other studies [47,65,66]), and (3) Butterworth filter (2nd order) with a frequency range from DC to 30 Hz. Also, the performance of the proposed system using several setups for SSTFT (window function, length of the window function (L), and number of FFT (NFFT)) into the pre-processing stage are presented.
As a first step, several window sizes (from 0.25 s to 1.5 s) for EEG processing were evaluated in our system throughout segments of 4.5 s, using different overlapping time (from 20 ms to 250 ms), in order to obtain a suitable combination (window size and overlapping time) for pedaling motor imagery recognition. This way, segments from both Pedaling (from 0.5 to 5.0 s after the green cue beginning) and Resting (from 0 to 4.5 s after the red cue beginning), and a low-pass filter (Butterworth, 2nd order, cut-off frequency at 30 Hz) into the pre-processing was used. Figure 3(a) shows for all subjects, the mean Kappa value achieved by our recognition system using different window sizes and overlapping time for processing. It is possible to observe that our system achieved its best performance using processing windows of 0.5 s with overlapping of 20 ms, which decreased increasing the overlapping time. In general, mean Kappa values higher than 80% were obtained for processing windows of 0.5 s to 2.0 s, overlapped up to 50 ms, as shown in figures 3(a). Moreover, figures 3(a)-(e) show that smallest processing windows as 500 ms, overlapped by 20 ms may be used for pedaling motor recognition, without significant difference with respect other setups, such as processing windows of 1.0 s with overlapping 30 ms. Thus, processing windows of 500 ms, overlapped by 20 ms were adopted here for other analysis and comparison.
As a second stage, the system performance throughout segments of 4.5 s from both 'Pedaling' (0.5 s to 5.0 s after the green cue) and 'Resting' (0 to 5.0 s after the red cue) was analyzed, taking different reference time (from 0 to 0.5 s) with respect to the red cue beginning. Figure 3(f) shows that segments of 4.5 s from the red ('Resting') cue beginning may be used to increase (ACC > 95%, Kappa > 90%, TPR > 95%, andFPR < 3%) the pedaling motor imagery discrimination, which agrees with [57]. It is described that visual evoked potentials occur into the first several hundred milliseconds (up to 500 ms) after a visual stimulus [56]. In [57], the authors suggest that the cue can induce a short-lived brain state as early as about 300 ms after cue onset that is an automatic process and probably unconscious, which could be relevant to a motor imagery-based BCI. As a result, we adopted for next analyses, segments of 4.5 s after cues from 0 to 4.5 s for 'Resting', and between 0.5 s and 5.0 s for 'Pedaling'.
Figure 4(f) presents the mean relative power between both 'Resting' and 'Pedaling', which was computed for ten subjects applying FFT over window processing of 500 ms, overlapped by 20 ms throughout segments of 4.5 s. The largest difference was achieved at 20 Hz, followed by other lowest frequencies such as 2 Hz and 10 Hz, which agrees with [32]. As a third stage, the performance of our system using only a processing window of 500 ms, overlapped by 20 ms was analyzed, applying into the pre-processing stage the following three strategies: filtering using subject-specific bands, filtering from DC to 30 Hz, and filtering from 8 to 30 Hz. Here, N p values from 4.5 to 8.0 were used applying SSTFT for different window functions to obtain the subject-specific bands.  1 show that the recognition system-based on band-pass filters from 8 to 30 Hz significantly ( p < 0.05) decreased its performance (mean Kappa of 65.69%) during pedaling motor imagery recognition. Furthermore, figure 4(a) shows that the system applying SSTFT with the same window function, presented different performance for several N p values. In fact, it is easy to observe Pedaling motor imagery recognition using three different setups into the pre-processing stage of the proposed system, such as a band pass filter with a frequency range from 8 to 30 Hz, a low-pass filter with cut-off frequency at 30 Hz, and a filter based on subject-specific bands obtained from SSTFT (applying several window functions with different length). Here, processing windows of 0.5 s, overlapped by 20 ms were used over segments of 4.5 s, which were selected at 0 s and 0.5 s after both Resting and Pedaling cues, respectively. (a) Mean Kappa achieved; (b) statistical multi-comparison analyzing the mean Kappa value achieved by the recognition system applying different strategies for pre-processing; (c) paired comparison between pre-processing strategies that do not present significant difference in figure (b); (d) statistical analysis through the Kappa value achieved by the system based on SSTFT, using different length for the window functions; (e) paired comparison through the Kappa value achieved by the system based on SSTFT, using different lengths for the window functions; (f) average changes of the spectral power during pedaling motor imagery relative to the baseline rest period. Table 1. Mean Kappa achieved for ten healthy subjects by the proposed recognition system, during pedaling imagery recognition, applying different setups into the pre-processing stage, and a processing window of 0.5 s, overlapped by 20 ms.  that SSTFT based on a Rectangular window achieved the lowest performance for N p from 5.5 to 8.0. A comparison between SSTFT using different window functions into the pre-processing stage was carried out, taking into account the mean performance for all N p , as shown in figures 4(b) and (c). Figure 4(b) shows that our system applying SSTFT with both Taylor and Rectangular functions, decreased significantly ( p < 0.05) the class discrimination with respect to the lowpass filter with cut-off frequency at 30 Hz. Also, figure 4(c) shows that the proposed system using a low-pass filter at 30 Hz, significantly improved the pedaling imagery recognition with respect to the global performance using filters based on subject-specific bands. It is worth noting that both Barthann and Blackman windows presented a global performance (mean Kappa of 92.82% and 92.77%, respectively) close to the low-pass filter at 30 Hz, as shown in figures 4(a)-(c) and table 1. In general, for N p = 6.0, the best performance for SSTFT was achieved using different window functions, as shown in figures 4(d) and (e). As a result, the recognition system achieved its highest performance, using SSTFT based on Hamming, Barthann, Blackman, Bohman and Triangular functions for all N p . Figures 5(a)-(d) show the performance reached by the recognition system applying these five methods for ten subjects. However, table 1 suggests that N p = 4.5 may be also suitable, applying different window functions into SSTFT (Kappa mean of 92.68%). For this reason, SSTFT based on both Barthann function (with mean Kappa of 92.80% for all N p ) and N p = 4.5 was adopted here for other comparisons. Table 2 shows for all subjects a summary of the system performance using into the pre-processing stage one of the following two setups: (1) Filters based on subject-specific bands computed from the SSTFT based on both Barthann function and N p = 4.5; (2) Low-pass filter with cut-off frequency at 30 Hz. Also, table 3 shows the subject-specific bands after applying SSTFT with both Barthann function and N p = 4.5, for all subjects. In general, the proposed recognition system using subject-specific bands (from DC to 24 Hz as average), reached an acceptable performance (mean ACC 96.43%, Kappa 92.85%, TPR 95.03% and FPR 2.18) during pedaling imagery recognition, which was close to the system using low-pass filters at 30 Hz (mean ACC 96.48%, Kappa 92.96%, TPR 95.66% and FPR 2.70). As a second step, the performance of the recognition system applying five different methods (SSTFT using N p = 6.0 with Gaussian, Hamming and Barthann windows, bandpass filter from DC to 30 Hz, and bandpass filter from 8 to 30 Hz) into the pre-processing stage was analyzed. In this case, processing windows of 0.5 s were used on the raw EEG, taking into account different time of offset (from 0 s to 1 s, with increment of 0.1 s) with respect to the beginning of the 'Pedaling' cue.

Discussion
Stroke patients can usually perform some residual movements, thus, techniques such as constraint-induced movement therapy (CIMT) [68] or bilateral arm training [69] may be used in their therapies, which have demonstrated to be useful to improve their motor function. However, these techniques are not applicable to repair severe limb weakness in stroke patients, as residual active movements are necessary for CIMT [70]. For this patient population, BCIs and brainmachine interfaces (BMIs) may play a crucial role, as severely weakened stroke patients are still able to imagine movements of the paretic limb, and they can attempt to move it even in the absence of actual movements [71][72][73][74]. In fact, these imagery and intent-to-move strategies have been reported to be useful in patients with mild to moderate motor deficits [75]. This way, pedaling motor imagery may be used to start pedaling through a BCI during lower-limb rehabilitation.
There are many clinical studies with stroke patients using BCIs for upper and lower limb rehabilitation [76][77][78][79][80][81]. In [76], thirty-two chronic stroke patients with severe hand weakness were trained on-line through a BCI rewarding desynchronization of ipsilesional oscillatory sensorimotor rhythms (SMR) with movements of hand and arm orthoses. This interaction in upper-limb rehabilitation significantly improved the Fugl-Meyer motor assessment (FMMA) score [76]. For this purpose, the authors used the SMR power computed from the ipsilesional electrodes to translate it into movement of the orthosis, taking into account a threshold value that is calculated through the mean of the power distribution during both rest and motor intention states. Then, when the SMR power is continuously in the classification area related to motor intention or resting for 200 ms, the orthosis begins to move or it stops, respectively. In [77], motor imagery of the most affected hand was used by eight severe chronic stroke patients to desynchronize their contralateral oscillatory sensorimotor rhythms during screening, and controlling an electrical stimulator (FES) in a second stage. All screened patients presented desynchronization, and a significant post-treatment improvement was detected in the primary outcome measure, as well as for almost all secondary outcome scores. The BCI developed by these authors was based on ERD analysis, considering both mu and beta frequency bands to determine the most discriminative frequency by applying offline topography maps of r 2 , in which the spatial distribution of ERD in the cortical region related to the MI of the paretic upper-limb (C3 or C4) was best and with the highest value of r 2 . As a result, the BCI−FES system presented mean values of ACC = 92.7% and TPR= 85.4% when it was evaluated in a stroke patient during two sessions of use in which the BCI was disabled during the resting trials (to avoid false FES activation to patient). Another BCI was proposed in [78] to recognize the following three states: motor relaxation, imagery of left-hand opening, and imagery of right-hand opening. For this, the raw EEG was first filtered by a bandpass FIR filter from 5 to 30 Hz (of order 101) and 50 Hz IIR notch Chebyshev type I filter (of order 6). Then, the filtered EEG is processed by the Bayesian classifier based on EEG covariance matrices to recognize three mental tasks. This BCI approach was used in a study that was carried out with 74 post-stroke patients with severe upper-limb Table 2. Performance of the proposed system using both SSTFT (with a Barthann window and N p = 4.5) and low-pass filter at 30 Hz into the pre-processing stage to recognize pedaling motor imagery. Processing windows of 0.5 s, overlapped by 20 ms were used over segments of 4.5 s. paralysis [78], analyzed in the following two groups: (1) 55 patients were asked to perform motor imagery opening their affected hand, for commanding a hand exoskeleton through a BCI to assist opening movements of the affected hand; (2) Opening movements of the affected hand of 19 patients were assisted by the hand exoskeleton without EEG. As a highlight, the BCI group showed an improvement in the action research arm test (ARAT) grasp score. In general, 21.8% and 36.4% of the patients in the BCI group improved their ARAT and FMMA scores, respectively. In [80], the authors showed that a BCI coupled with functional electrical stimulation (FES) can be used by chronic stroke survivors to increase their motor recovery, and functional connectivity between motor areas in the affected hemisphere. For this proposed BCI, each EEG channel was spatially filtered with the well-known Laplacian derivation to reduce spatial noise and aid in identification of sources [82]. Then, the power spectra density (PSD) of each spatially filtered EEG channel was computed every 62.5 ms throughout the active period for a frequency range from 4 to 40 Hz with 2 Hz resolution, using the Welch periodogram based on Hanning windows of 500 ms, overlapped by 75%. During training, the most discriminant EEG spatio-spectral features between resting and movement attempts were selected by the authors to build a Gaussian classifier, using a canonical variate analysis-based method to rank all candidate PSD features (up to ten candidate features were manually selected) in terms of discriminant power. As a result, selected discriminant EEG features for patients were localized in the ipsi-and contralesional sensorimotor areas, in frequency bands normally associated with voluntary movements. As a highlight, individual BCIs were built with the data of the offline calibration session, achieving average offline performance across BCI patients of TPR = 94.6 ± 3.4% and FPR = 3.0 ± 3.9%. In [81], a BCI based on imaginary dorsiflexion movements driving a motorized ankle-foot orthosis was proposed for stroke rehabilitation, showing its efficacy in inducing cortical neuroplasticity in nine healthy subjects with a short intervention procedure. This system detects imaginary dorsiflexion movements within a short latency from scalp EEG through the analysis of movement-related cortical potentials (MRCPs), using the locality preserving projection (LPP) algorithm to project the data along the directions that preserve the neighborhood structure, and LDA classifier. Before applying LPP and LDA, a bandpass filter from 0.05 Hz to 3 Hz was applied on Cz to increase its signal-to-noise ratio, followed by a large Laplacian spatial filter centered at Cz. Here, only the EEG data between 1.5 s before and 0.5 s after the movement onset was analyzed. This BCI achieved TPR of 73.0% ± 10.3% during on-line motor imagery recognition. BCI systems frequently use signal processing methods, such as spatial filtering, to enhance performance [80,82]. A comparison between both nearest-neighbor and next-nearest Laplacian [83] was carried out in [82], where nearest-neighbor Laplacian produced signals that were more orthogonal while the nextnearest Laplacian produced signals that resulted in better accuracy. It is worth mentioning that better prediction of user's intent produces increased speed and accuracy of communication and control. Furthermore, the signal identification is important to avoid the possibility of control by artifacts. However, there are powerful tools, such as common spatial patterns and Riemannian geometry, for spatial filtering with low-computational cost, which have shown promising results [47,66]. Robotic systems based on BCIs of low-computational cost and fast output responses (very short delay around several hundred ms) during motor intention recognition are suitable to provide an effective neuro-rehabilitation for post-stroke patients, as the resulting control can be perceived as a closed loop [81]. Addressing this aim, we proposed a recognition system based on low-pass filters of 2nd order for pre-processing, Riemannian geometry for feature extraction [47], FNCA for feature selection [48], and the fast LDA classifer [64] for pedaling motor imagery recognition, which achieved ACC > 85%, Kappa > 71%, and FPR < 11%, using a frequency range from DC to 30 Hz, and subject-specific bands. It is worth noting that the accuracy directly affects the speed response of the BCI [35]. Although previous works have reported promising results using BCIs based on bandpass filter of Butterworth from 8 to 30 Hz for motor imagery recognition [47,66], low performance (average Kappa = 65.42%) was achieved by our recognition system applying this frequency range. These results agree with [84], where the authors found that the BCI may improve pedaling imagery recognition using the δ band. Similarly, they combined spectra features from δ, α and β bands to recognize pedaling motor imagery [33,34]. In [33], the authors proposed a BCI based on the spatial filter common average reference, spectra features, and EEG channel selection for pedaling imagery recognition, obtaining for five healthy subjects, accuracy between 78 and 89% during offline processing, employing windows of 4 s. However, the best BCI performance for each subject was obtained with different setups of electrode locations, and different methods for feature extraction. Likewise, other research studying lowerlimbs has demonstrated, through BCIs based on EEG, the relevance of low frequencies for motor intention recognition [53][54][55]85].
Similar to aforementioned studies [33,76,77,80], we applied spectra analysis to obtain subject-specific bands linked to pedaling motor imagery, but it is only used in the training stage, avoiding the high computational cost using our recognition system on-line. As a result, our system based on subject-specific bands presented good performance (ACC 87.27%, Kappa 74.54%, TPR 75.77% and FPR 6.94%), introducing the sparseness constraints in combination with the total power to automatically locate on the time-frequency representation (using SSTFT based on a Barthann window), those frequency bands that packed the highest power during pedaling motor imagery. It is worth mentioning that this approach was only used here applying low-pass filters. This confirms again the relevance of low frequencies for pedaling imagery recognition. Notice that our approach can be used to automatically select the subject-specific frequency bands without a supervision procedure. This unsupervised property of both subject-specific band location and Riemannian geometry, makes our system suitable for motor imagery recognition, whereas reference sensors such as accelerometers and goniometers are not able to accurately mark on the raw EEG when the motor imagery action occurs. Furthermore, it is reported that stroke patients can present attention deficits [86,87], which may affect the BCI performance. Thus, unsupervised methods such as Riemannian geometry may be a suitable solution to minimize this effect. Here, we used a supervised method for feature selection after applying Riemannian geometry, however its performance may be affected by attention deficits of stroke patients. For this reason, other unsupervised methods for feature selection will be explored in future works.
On the other hand, for BCIs based on motor imagery tasks, the selection of the window size for processing is still a challenge [33,66]. Generally, window sizes from 1 to 2 s have been used after the cues for processing [33,47,66]. In this study, we evaluated several widow sizes of 0.5 s, 1.0 s, 1.5 s, 2.0 s, 3.0 s, and 4.0 s getting good performance from 0.5 s to 1.5 s, which agrees with [33]. As a highlight, processing windows of 0.5 s provided the highest performance for our recognition system.

Limitations
Only ten healthy subjects were used here to evaluate our approach. Then, future works must be carried out with more healthy subjects, including chronic post-stroke patients, in order to generalize our results. Moreover, our experimental protocol should be modified using random cue sequence.
Although our system based on subject-specific bands achieved promising results, future works will be needed to improve the subject-specific band location. In addition, it is worth mentioning that the time-frequency representation is affected by both sampling rate and window size for the signal processing. It is worth noting that lower-frequencies introduce more noise and artifacts, such as ocular and movement artifacts. Thus, other analyses removing possible artifacts of lowfrequencies should be carried out for comparison.
Likewise, our approach uses FNCA for feature selection, which performance may be affected by attention deficits of stroke patients.

Conclusion
To conclude, this work presented a system based on Riemannian geometry to recognize pedaling motor imagery, which improved its performance by applying methods to obtain subject-specific bands (for pre-processing) and the best features (over Riemannian features). We consider that our recognition system can be used with the traditional low-pass filter of Butterworth (2nd order) to obtain main components of the EEG from DC to 30 Hz, showing promising results for pedaling imagery recognition, which require more studies. This work is the first stage of a research project from the Federal University of Espirito Santo (UFES/Brazil), which aims to build a system based on a BCI to command a commercial motorized pedal (named ACTIVcycle) for neuro-rehabilitation purposes. In future works, our recognition system and the motorized pedal will be integrated to provide lower-limb rehabilitation.

Acknowledgments
This study was financed in part by FAPES (