A transfer learning-based feedback training motivates the performance of SMR-BCI

Objective. Feedback training is a practical approach for brain–computer interface (BCI) end-users to learn to modulate their sensorimotor rhythms (SMRs). BCI self-regulation learning has been shown to be influenced by subjective psychological factors, such as motivation. However, few studies have taken into account the users’ self-motivation as additional guidance for the cognitive process involved in BCI learning. In this study we tested a transfer learning (TL) feedback method designed to increase self-motivation by providing information about past performance. Approach. Electroencephalography (EEG) signals from the previous runs were affine transformed and displayed as points on the screen, along with the newly recorded EEG signals in the current run, giving the subjects a context for self-motivation. Subjects were asked to separate the feedback points for the current run while the separability achieved in prior training remained on display. We conducted a between-subject feedback training experiment, in which 24 healthy SMR-BCI naive subjects were trained to imagine left- and right-hand movements. The participants were provided with either TL feedback or typical cursor-bar (CB) feedback (control condition), for three sessions on separate days. Main results. The behavioral results showed an increased challenge and stable mastery confidence, suggesting that subjects’ motivation grew as the feedback training went on. The EEG results showed favorable overall training effects with TL feedback in terms of the class distinctiveness and EEG discriminancy. Performance was 28.5% higher in the third session than in the first. About 41.7% of the subjects were ‘learners’, including not only low-performance subjects but also good-performance subjects who might be affected by the ceiling effect. In the last session, performance with TL feedback was 60.5% higher than with CB feedback. Significance. The present study demonstrated that the proposed TL feedback method boosted psychological engagement through the self-motivated context, and further allowed subjects to modulate SMR effectively. The proposed TL feedback method also provided an alternative to typical CB feedback.


Introduction
A brain-computer interface (BCI) is an artificial output system that measures brain activity and converts it to replace and enhance the output of the central nervous system [1]. Most non-invasive BCIs rely on electroencephalography (EEG) decoding to communicate the user's intentions to external devices [2]. The sensorimotor rhythm (SMR) refers to an oscillatory idle rhythm of synchronized EEG signals with a frequency range of 8-13 Hz appearing over the sensorimotor cortex (electrodes C3 and C4). A decrease in SMR is associated with imagining body movements, a phenomenon known as event-related desynchronization [3]. SMR-BCI has attracted attention from researchers because of its clear physiological characteristics [4][5][6][7]. The SMR-BCI control is presented in a vertical pyramid with the users at the base, the machine learning technology at the second level, and the brain-controlled device at the top [8]. The user has to generate certain patterns of brain activity, and then, machine learning identifies the feature space that maximizes differentiation of users' intentions. The user's ability to not only produce distinct brain activity patterns voluntarily but also to stabilize them within one task is widely regarded as an SMR-BCI skill [9]. It has been shown that users could learn to improve their SMR-BCI skills gradually with the help of feedback training [10], including visual feedback [11,12], haptic feedback [13,14], auditory feedback [15], virtual reality (VR) feedback [16,17], in-game feedback [18] and multi-mode feedback [19].
A common assertion is that the cognitive mechanism responsible for 'self-control of brain activity' is motor skill learning. However, cognitive mechanisms such as motivation, locus of control towards technology, and spontaneous strategies also influence BCI learning [20]. Hence, these factors may represent key cognitive processes beyond skill learning. In the study of motivation, the focus of the current study, a distinction is commonly made between extrinsic and intrinsic motivation; according to dual-process theory, these are the automatic and controlled components of motivation, respectively. Intrinsic motivation is sensitive to self-efficacy beliefs, and extrinsic motivation is more sensitive to reward [21]. There is evidence that self-control of brain activity is influenced by both intrinsic and extrinsic motivation. Extrinsic motivation influences BCI performance when the monetary reward is manipulated. To manipulate motivation, subjects received a monetary reward of 25, 50 or 0 euro cents for every letter spelled correctly with the BCI. It turned out that extrinsic motivation may be a psychological variable affecting P300 amplitude [22,23]. Some studies examined the effect of an intrinsic motivation induction, called 'motivation-to-help', on BCI performance in healthy subjects. Before the copy-spelling P300-BCI task, the 'motivation group' was informed of the importance of their full attention and effort, and then read a vignette describing the significance of BCI research for people with amyotrophic lateral sclerosis who may eventually depend on BCI. The motivation group obtained a higher classification accuracy than the non-motivation group [24].
Cognitive task-based BCIs, such as those based on mental calculation and motor imagery (MI), should be more subject to the effects of motivation [25]. Sollfrank et al proposed enriched multi-mode feedback by using a liquid floating through a visualization of a funnel that was controlled by classification results and the stability of the EEG signals. They found an enhancement of motivation and minimization of frustration in subjects throughout the sessions, but significant improvement in performance was seen only during the first session [19]. However, the combination of visual and auditory feedback may not be effective, since the fusion of multiple incongruent sources of information may increase cognitive load and the number of errors [26]. Alternatively, BCI training could be gamified [16] or performed in a VR environment [18]. It appeared that VR and gamified BCI improved users' motivation and training performance on average. Bonnet et al designed a multi-user BCI video game that increased motivation, engagement, and possibly motor imagery performance for some users [27]. Games can be intrinsically motivating and are often played in relevant environments with novel appeal, challenges, and aesthetic value. However, cognitive overload or additional processing effects could occur in BCI game applications [28].
Intrinsically motivated subjects may be more likely to engage in the task willingly, improving their skills [29]. To this end, in the current study we gave subjects feedback designed to increase intrinsic motivation. The subjects were provided information about their performance in the prior training session, and we assumed that seeing information about past performance would increase subjects' self-motivation to improve. Duan et al proposed a visual feedback method that intuitively reflects the distribution of EEG signals as feedback points in Riemannian geometry [30]. However, the nearly 160 feedback points within one run were cleared from the screen at the beginning of the next run. For our purposes, it would be ideal for the prior compression objects to remain on the screen. In practice, the physiological and psychological states, environment, instrument (channel locations) and subjects' mental strategies may change quickly, leading to the non-stationarity of EEG data [31]. As a result, the EEG measurements in the prior training and the current training do not belong to the same domain, and hence the distance between points cannot correctly reflect the similarity between EEG measurements. In other words, the current feedback training method would not be able to make use of the previous data due to the large differences between sessions, which greatly influence the effect of feedback training.
Many researchers have applied transfer learning (TL) to overcome the session-to-session variation in SMR-BCI [32]. Since the covariance matrices of EEG signals naturally lie on the Riemannian manifold of symmetric positive definite (SPD) matrices rather than in Euclidean space, Riemannian approaches have become popular in EEG-based BCIs [33]. Rodrigues et al proposed Riemannian Procrustes analysis (RPA) to accommodate covariate shifts of EEG signals recorded during different sessions [34]. RPA matches the statistical distributions of the covariance matrices of the EEG trials from different sessions by applying three transformations in sequence, namely translation, scaling, and rotation. However, the use of TL in feedback training to achieve more effective feedback has not been studied so far.
Using Riemannian affine transformation in the current study, we made data from different runs comparable in order to display the feedback from the prior and current runs on the same screen, and thus, provide the context of self-motivation for subjects [31]. Specifically, we calculated the pair-wise Riemannian distance among SPDs, and the distance matrix was dimension-reduced and visualized on the screen as objects (points and rectangles). Then, the centers of mass of two classes for all historical EEG data were estimated, and the historical centers of mass and the SPDs of the current run were affine transformed into the identity matrix in order to realize the presentation of historical and current information in the same space. The feedback was updated on the screen every second during a MI task, and the subject could learn to modulate their sensorimotor rhythm effectively with the help of both the historical performance and the feedback on the current run.
The main goal of this study was to investigate whether favorable subjective psychological factors induced by SMR-BCI feedback, namely self-motivation, could make the learning process more efficient. To this end, two groups of subjects were trained with two different feedback protocols, namely TL-based feedback and typical CB feedback, during a three-day training. We quantified the subjects' performance in terms of 'online, actual' performance, class distinctiveness, and neurophysiological discriminancy, as well as behavioral metrics, for the two groups. We hypothesized that subjects under TL feedback would be more self-motivated to improve their SMR-BCI skills over the three training sessions than subjects under the CB method.

Subjects
Twenty-four healthy subjects volunteered to participate in the SMR-BCI feedback training experiment. Six of them were female. The mean age was 20.71 ± 1.00 years, ranging from 19 to 23 years of age. All subjects were right-handed with normal or corrected-to-normal vision, and had not participated in a BCI experiment before. All experimental procedures were approved by the Northwestern Polytechnical University Medical and Experimental Animal Ethics Committee. Each subject was informed about the content of the experiment and signed informed consent beforehand. In a between-subject design, the 24 healthy BCI-naive subjects were evenly and randomly assigned to either typical CB feedback (n = 12) or TL feedback (n = 12). The mean age of subjects in the TL and CB feedback groups was 21.00 ± 1.13 years and 20.42 ± 0.79 years, respectively. Three females were included in each group. A t test revealed no significant difference in age between the two groups (p = 0.206).
Figure 1(a) illustrates the structure of the SMR-BCI training experiment. The experiment followed a 2 × 3 mixed design, with the three sessions as the within-subjects factor and the feedback type (TL or CB) as the between-subjects factor. Subjects were trained for three sessions on separate days within one week. During one session, they performed ten runs with individual breaks. At the beginning of each run, subjects were asked to stay still and rest for 10 s with their eyes on the fixation mark in the center of the screen. The 10 s resting-state data were later used for artifact detection. Each run, lasting about 5 min, consisted of 40 trials with 20 trials per class presented in randomized order. Before the first training session, all subjects underwent an open-loop calibration session in which they performed the task without any feedback. The calibration session consisted of four runs, each containing 30 trials. Data collected in the calibration session were used to preset the parameters and the classifier model in the CB online training experiment.

Experiment setup
During the experiment, the subjects sat in comfortable chairs facing the computer screen, and the laboratory was kept quiet and at an appropriate temperature. The experimenter gave the instructions, which included a reminder to stay still to minimize noise in the EEG signals, the task instruction, the time-line of each trial, an explanation of the training goals, and the meaning of the feedback objects. Notably, the experimenter only prompted subjects to perform first-person kinesthetic left- and right-hand MI tasks but did not prescribe a specific strategy. Therefore, the subjects could fully explore the most disparate pairs of mental strategies according to the feedback, such as imagining playing the piano using the left hand and playing basketball using the right hand. Figure 1(b) shows the trial time-line for the TL feedback method. Each trial started with a 'beep' sound to remind subjects to stay focused. After 1 s, a blue or red arrow pointing either left or right appeared on the left or right side of the screen, prompting the subjects to perform either left-hand or right-hand MI. The arrow remained on one side of the screen throughout the MI task period. The MI task lasted for 4 s, during which the visual feedback was presented to the subjects at the second, third, fourth, and fifth second. The TL feedback modality is explained in detail in section 3.1. After a two-second break, a new trial started.
The trial structure for the CB feedback protocol is shown in figure 1(c). Each trial started with a one-second fixation cross. Then, a blue or red arrow pointing to the left or right appeared in the center of the screen and stayed there for 4 s. Feedback was provided through a light gray CB that moved horizontally to the left or right between two fixed arrows according to the classification values. The position of the CB was updated every 62.5 ms. Once it hit the small arrow on the left or right, it merged with the small arrow to form a long blue or red decision arrow. As soon as the decision arrow appeared, the MI task was suspended and the subjects rested until the subsequent trial began.

Data acquisition
A Neuracle NeuSen W series wireless EEG acquisition system with a 10-20 system EEG cap was used to record the EEG signals. As shown in figure 2, EEG data from 30 electrodes (Fz, F3, F4, FCz, FC1, FC2, FC3, FC4, FC5, FC6, FT7, FT8, Cz, C1, C2, C3, C4, C5, C6, CP1, CP2, CP3, CP4, CP5, CP6, Pz, P3, P4, P5, P6) were collected during the experiment. EEG data from only 10 electrodes (FC3, FC4, C1, C2, C3, C4, C5, C6, CP3, CP4) were used for TL feedback [30], and EEG data from 15 electrodes (Fz, FC3, FC1, FCz, FC2, FC4, C3, C1, Cz, C2, C4, CP3, CP1, CP2, CP4) were used for CB feedback [35]. The reference electrode was CPz, which was placed in the central-parietal area. All electrode impedances remained below 5 kΩ. The EEG signals were sampled at 250 Hz and band-pass filtered between 0.01 and 100 Hz.
Figure 3(a) illustrates the feedback modality for the TL feedback method. The feedback objects appeared in the center of the screen, and an instruction arrow was displayed on the left or right side of the screen, with a black background. The feedback from an EEG segment of the current run was represented as a small dot. The centers of mass of the dots of the two classes were represented as large rectangles in order to provide a more intuitive illustration of the distinctiveness of the two classes. In addition, two small rectangles that appeared from the second run onward denoted the centers of mass of the two classes in the previous runs, motivating the subjects to learn better. During a run, new points appeared constantly, and the rectangles were updated as new points entered. The feedback and instructions were presented using MATLAB Psychtoolbox.
Two centers of mass of all previous runs appeared from the second run. The large rectangle in the first run was displayed on the feedback screen of the second run in the form of a small rectangle after being affine transformed. Both the large and small rectangles of the second run (marked with a white dashed box) jointly served as the centers of mass of the previous runs for the third run and, thus, appeared as a small rectangle on the screen of the third run. For the subsequent runs, the centers of mass in the white dashed box of the last run were presented in the form of small rectangles on the current feedback screen. In addition, the bottom area of figure 3(b) illustrates that neither the small nor the large rectangles appeared until the ninth feedback point (shown as the red segment in the grey grid) in each run. Because the fluctuation of the EEG signals between the current run and the previous runs was relatively large in the beginning, the four rectangles did not appear until the transfer had stabilized. Notably, the ninth point refers to the count of clean EEG segments (shown as the dark grey segments with labels) and did not include segments with artifacts, which had been discarded.

The online processing method
EEG measurements were separated into four non-overlapping 1 s segments during the MI period in each trial. Updated feedback was displayed at the end of each 1 s EEG segment. Figure 4 shows the flowchart of the TL feedback method. The conversion from the current EEG measurements into feedback objects included eight steps. First, the current 1 s EEG segment was examined by using the Riemannian Potato method. An EEG segment was rejected if its distance from the reference point exceeded the threshold [36]. If the EEG segment was rejected, a sharp beep was emitted to remind the subject to stay still and concentrate. If not, the EEG segment was retained and the following steps proceeded. We used an advanced training protocol. Specifically, the cleaned EEG segment was band-pass filtered within the 8-30 Hz broadband in the first session and within a 4 Hz wide subject-specific frequency band in the remaining two sessions. The subject-specific band was chosen as the 4 Hz wide band corresponding to the highest class distinctiveness in the first session [30]. Next, we calculated the covariance matrix of the filtered EEG segment, obtaining an SPD matrix. The centers of mass of all SPDs of the left-hand-MI and right-hand-MI classes in the current run were initialized at the ninth trial and updated by the current cleaned SPD in the subsequent trials. Two centers of mass of the two classes were calculated for the SPDs belonging to the MI task period from all previous runs. Then, TL was applied to the SPDs of the current run, the two current centers of mass and the two previous centers of mass. A symmetric (N + 4) × (N + 4) distance matrix was obtained by calculating the pair-wise distances among the N SPDs, the two current centers of mass and the two previous centers of mass. Finally, the symmetric distance matrix was reduced to two dimensions using a visualization technique called diffusion maps. Notably, we calculated the similarity matrix (normalized covariance) of the Riemannian distance matrix as the affinity matrix in diffusion maps, in order to separate the four rectangles and the point clusters, as overlapping objects could affect the visual effects and make it more difficult for subjects to compare the distances between rectangles intuitively. The point clusters of left-hand-MI (blue) and right-hand-MI (red) did not appear in a fixed place on the screen, and they rotated according to the direction of the second-column eigenvector of the normalized weight matrix in diffusion maps. Although dramatically changed Riemannian distance matrices caused the point clusters to rotate in different directions for each subject, the subjects paid attention only to the relative distance between the red and blue point clusters, not to their specific place on the screen.
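The last two steps of the pipeline (pair-wise Riemannian distances, then a two-dimensional diffusion-map embedding) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the source: the affinity is a plain Gaussian kernel rather than the normalized-covariance similarity described above, the kernel width `eps` defaults to the median squared distance, the four centers of mass are omitted, and the helper names (`riemann_distance`, `distance_matrix`, `diffusion_map`, `simulate`) and the synthetic three-channel data are purely illustrative.

```python
import numpy as np

def riemann_distance(P1, P2):
    """Affine-invariant Riemannian distance between two SPD matrices."""
    w, V = np.linalg.eigh(P1)
    P1_isqrt = (V / np.sqrt(w)) @ V.T              # P1^(-1/2)
    lam = np.linalg.eigvalsh(P1_isqrt @ P2 @ P1_isqrt)
    return np.sqrt(np.sum(np.log(lam) ** 2))

def distance_matrix(spds):
    """Symmetric matrix of pair-wise Riemannian distances."""
    n = len(spds)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = riemann_distance(spds[i], spds[j])
    return D

def diffusion_map(D, n_components=2, eps=None):
    """Embed points from a distance matrix (Gaussian kernel: an assumption)."""
    eps = np.median(D[D > 0] ** 2) if eps is None else eps
    K = np.exp(-D ** 2 / eps)                      # affinity matrix
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))                # symmetric normalization
    w, V = np.linalg.eigh(S)
    order = np.argsort(w)[::-1]                    # largest eigenvalues first
    w, V = w[order], V[:, order]
    psi = V / np.sqrt(d)[:, None]                  # eigenvectors of the Markov matrix
    # Skip the first (constant) eigenvector; scale the rest by their eigenvalues.
    return psi[:, 1:n_components + 1] * w[1:n_components + 1]

rng = np.random.default_rng(0)

def simulate(scales, n_seg=10, n_samples=250):
    """Covariances of synthetic 1 s segments with the given channel variances."""
    X = rng.standard_normal((n_seg, len(scales), n_samples))
    X = X * np.sqrt(scales)[:, None]
    return [np.cov(x) for x in X]

left = simulate(np.array([9.0, 1.0, 1.0]))         # synthetic "left-hand" class
right = simulate(np.array([1.0, 9.0, 1.0]))        # synthetic "right-hand" class
D = distance_matrix(left + right)
points = diffusion_map(D)                          # one 2-D point per segment
```

With well-separated synthetic classes, the first embedding coordinate splits the two clusters, which mirrors the idea that subjects only need the relative distance between clusters rather than their absolute screen position.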

TL method
An appropriate metric is necessary to determine the distance between EEG segments in order to measure their dissimilarity. The Riemannian distance and the corresponding norm given by Riemannian geometry could be applied in SMR-BCI training [30,37]. Because the cross-session changes can be understood as geometric transformations of the covariance matrices, the affine transformation provides an optimal process for solving the TL problem due to the affine invariance property of the Riemannian distance and Riemannian mean. Riemannian affine transformation was employed on the EEG covariance matrices of every run to center them with respect to  an identity matrix, making data from different runs align.
In an SMR-BCI feedback system, we registered a 1 s period of n-channel EEG measurement as $X(t) \in \mathbb{R}^{n \times L}$, where n and L denote the number of channels and the sample points, respectively (n = 10, L = 250 in this experiment). The second-order statistical characteristics, especially the covariance matrix of the EEG measurement, contain the separable information of brain state. The n × n symmetric covariance matrix of X(t) can be defined as

$$P = \frac{1}{L-1}\,\bigl(X(t) - \bar{X}\bigr)\bigl(X(t) - \bar{X}\bigr)^{T},$$

where $\bar{X}$ denotes the mean of X(t). P is a symmetric matrix in which the diagonal and off-diagonal elements are the variances and covariances of X(t), respectively. The covariance matrix belongs to the SPD matrices. We set $S(n) = \{S \in \mathbb{R}^{n \times n},\ S = S^{T}\}$ as the n-dimensional symmetric matrix space, and $P(n) = \{P \in S(n),\ x^{T} P x > 0,\ \forall x \in \mathbb{R}^{n},\ x \neq 0\}$ as the space of n-dimensional SPD matrices, which turns out to be a smooth Riemannian manifold with non-positive curvature.
For two points $P_1, P_2 \in P(n)$, the minimum-length curve connecting $P_1$ and $P_2$ is called a geodesic, and the length of the geodesic from $P_1$ to $P_2$ (and vice versa) is defined as the Riemannian distance, whose closed-form solution is

$$\delta(P_1, P_2) = \left\| \mathrm{Log}\!\left(P_1^{-1/2} P_2 P_1^{-1/2}\right) \right\|_F = \left[\sum_{i=1}^{n} \log^{2} \lambda_i \right]^{1/2}, \tag{4}$$

where $\|\cdot\|_F$ denotes the Frobenius norm and $\lambda_i$, $i = 1, \dots, n$, denotes the eigenvalues of $P_1^{-1} P_2$. Log(X) represents the principal matrix logarithm, which can be obtained by diagonalization of X, that is,

$$\mathrm{Log}(X) = V\,\mathrm{Log}(D)\,V^{-1},$$

where D and V are the diagonal matrix of eigenvalues and the matrix of eigenvectors of X, respectively. The center of mass of k SPDs $\{P_1, \dots, P_k\}$ is the point $\bar{P}$ that minimizes the sum of the squared Riemannian distances to each matrix,

$$\bar{P} = \arg\min_{P \in P(n)} \sum_{i=1}^{k} \delta^{2}(P, P_i),$$

which can be computed iteratively [38].
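As an illustration of the two quantities defined here, the sketch below computes the affine-invariant Riemannian distance from the eigenvalues of $P_1^{-1} P_2$ and estimates the center of mass with a standard fixed-point iteration in the tangent space. The function names, the iteration count and the tolerance are assumptions chosen for the example; the iterative scheme is a common textbook choice and is not claimed to be the exact procedure of reference [38].

```python
import numpy as np

def _sym_fun(P, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(P)
    return (V * fun(w)) @ V.T

def riemann_distance(P1, P2):
    """sqrt(sum_i log^2(lambda_i)), lambda_i being the eigenvalues of P1^-1 P2."""
    P1_isqrt = _sym_fun(P1, lambda w: 1.0 / np.sqrt(w))
    lam = np.linalg.eigvalsh(P1_isqrt @ P2 @ P1_isqrt)   # same spectrum as P1^-1 P2
    return np.sqrt(np.sum(np.log(lam) ** 2))

def center_of_mass(mats, n_iter=50, tol=1e-10):
    """Iterative estimate of the Riemannian center of mass of SPD matrices."""
    M = np.mean(mats, axis=0)                  # Euclidean mean as starting point
    for _ in range(n_iter):
        M_sqrt = _sym_fun(M, np.sqrt)
        M_isqrt = _sym_fun(M, lambda w: 1.0 / np.sqrt(w))
        # Average of the matrices mapped to the tangent space at M.
        T = np.mean([_sym_fun(M_isqrt @ P @ M_isqrt, np.log) for P in mats], axis=0)
        M = M_sqrt @ _sym_fun(T, np.exp) @ M_sqrt
        if np.linalg.norm(T) < tol:            # tangent mean vanishes at the solution
            break
    return M

# Usage on covariance matrices of synthetic 10-channel, 250-sample segments.
rng = np.random.default_rng(0)
covs = np.array([np.cov(rng.standard_normal((10, 250))) for _ in range(20)])
M = center_of_mass(covs)
```

For commuting matrices the center of mass reduces to the element-wise geometric mean, for example the mean of diag(1, 1) and diag(4, 9) is diag(2, 3), which gives a quick sanity check.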
Affine invariance means that the distance between two SPD matrices is invariant under any invertible linear transformation in the data space. Specifically, the affine invariance property of the Riemannian distance δ(·, ·) is

$$\delta(P_1, P_2) = \delta\!\left(C P_1 C^{T},\ C P_2 C^{T}\right), \quad \forall C \in GL(n),$$

where $GL(n) = \{C \in M(n),\ \det(C) \neq 0\}$ denotes the set of invertible matrices. Accordingly, the invariance property of the center of mass is

$$\bar{P}\!\left(C P_1 C^{T}, \dots, C P_k C^{T}\right) = C\,\bar{P}(P_1, \dots, P_k)\,C^{T}.$$

We estimated a reference matrix for every session, and then performed an affine transformation of the data using the reference matrix. Therefore, owing to the affine invariance of the Riemannian distance, data points of the same task in both sessions would move in the same direction relative to the reference matrix. Let $\{C^{(1)}_1, \dots, C^{(1)}_{N_1}\}$ and $\{C^{(2)}_1, \dots, C^{(2)}_{N_2}\}$ be the covariance matrices observed in session1 and session2, respectively. We aligned the two datasets from session1 and session2 by the transformation

$$\tilde{C}^{(j)}_i = \left(\bar{C}^{(j)}\right)^{-1/2} C^{(j)}_i \left(\bar{C}^{(j)}\right)^{-1/2}, \quad j = 1, 2,$$

where the centers of mass $\bar{C}^{(1)}$ and $\bar{C}^{(2)}$ are used as the reference matrices for session1 and session2, respectively. As a consequence, the centers of mass of both sessions are centered at the identity matrix, and notably, the distances between SPDs of the same session remain unchanged, due to the congruence invariance property of the Riemannian distance.
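The alignment step can be checked numerically with a small sketch. It assumes the recentering form $R\,C_i\,R$ with $R$ the inverse square root of the session's center of mass, a standard fixed-point iteration for the center of mass, and a hypothetical fixed mixing matrix `A` that stands in for a session-to-session change; none of these names or values come from the source.

```python
import numpy as np

def _apply(P, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(P)
    return (V * fun(w)) @ V.T

def distance(P1, P2):
    """Affine-invariant Riemannian distance."""
    R = _apply(P1, lambda w: w ** -0.5)
    return np.sqrt(np.sum(np.log(np.linalg.eigvalsh(R @ P2 @ R)) ** 2))

def center_of_mass(mats, n_iter=100, tol=1e-12):
    """Fixed-point iteration for the Riemannian center of mass."""
    M = np.mean(mats, axis=0)
    for _ in range(n_iter):
        R, R_inv = _apply(M, np.sqrt), _apply(M, lambda w: w ** -0.5)
        T = np.mean([_apply(R_inv @ P @ R_inv, np.log) for P in mats], axis=0)
        M = R @ _apply(T, np.exp) @ R
        if np.linalg.norm(T) < tol:
            break
    return M

def recenter(mats):
    """Move a session so that its center of mass becomes the identity."""
    R = _apply(center_of_mass(mats), lambda w: w ** -0.5)
    return np.array([R @ P @ R for P in mats])

rng = np.random.default_rng(1)
session1 = np.array([np.cov(rng.standard_normal((4, 250))) for _ in range(15)])
# Hypothetical invertible mixing that mimics a change between sessions.
A = np.array([[2.0, 0.5, 0.0, 0.0],
              [0.0, 1.5, 0.3, 0.0],
              [0.0, 0.0, 1.0, 0.2],
              [0.0, 0.0, 0.0, 0.8]])
session2 = np.array([A @ P @ A.T for P in session1])

aligned1, aligned2 = recenter(session1), recenter(session2)
```

After recentering, both sessions share the identity as their center of mass, while distances between matrices of the same session are unchanged, so points from different sessions can be placed in one common space.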
In the SMR-BCI feedback system, we denoted the aligned matrices as

$$\tilde{P}^{(\mathrm{pri})} = \left(\bar{P}^{(\mathrm{pri})}\right)^{-1/2} P^{(\mathrm{pri})} \left(\bar{P}^{(\mathrm{pri})}\right)^{-1/2}, \qquad \tilde{P}^{(\mathrm{cur})} = \left(\bar{P}^{(\mathrm{cur})}\right)^{-1/2} P^{(\mathrm{cur})} \left(\bar{P}^{(\mathrm{cur})}\right)^{-1/2},$$

where $\bar{P}^{(\mathrm{pri})}$ and $\bar{P}^{(\mathrm{cur})}$ represent the centers of mass of all SPDs in the prior training and in the current run, respectively. In this way, the centers of mass of the two categories of the prior training are shifted through the same TL as the current SPD matrices, and furthermore, observations belonging to both the prior and current training do not change their relative distances and geometric structure.
were obtained, where each feature reflected the estimated power of a specific channel and frequency band. We then conducted a feature selection on the initial 345 features using the canonical discriminant spatial patterns (CDSPs) method [35]. Ten features that maximized the difference between the PSD of the two classes were selected for each subject. Next, we trained a classifier on the selected features using linear discriminant analysis (LDA). Notably, the parameters and classifier model had to be obtained offline using EEG data collected in the calibration session, and the classifier was not adjusted or retrained in the training process. The classification result was represented as decision = ±1; accordingly, the bar moved from its current position x as x = x + decision × dx. The value of dx was scaled with the performance of each subject: dx equaled 10 for a good-performance subject and 5 for a low-performance subject. The subjects performed the MI task until the decision arrow appeared at the fifth second. The gray CB hit the left arrow if its center had shifted to the left side, and vice versa.
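The bar update rule x = x + decision × dx can be made concrete with a short simulation. The following is only a sketch: the distance `limit` from the center to either target arrow is a hypothetical value that the source does not state, and the convention that decision = +1 moves the bar to the right is likewise an assumption. The 64 updates per trial follow from a 4 s task period sampled every 62.5 ms.

```python
def run_cursor_bar(decisions, dx, limit=100.0):
    """Integrate classifier decisions (+1 or -1) into a horizontal bar position.

    `limit` is a hypothetical distance from the center to either target arrow.
    Returns (side, n_updates): where the bar ended and how many updates were used.
    """
    x = 0.0
    step = 0
    for step, decision in enumerate(decisions, start=1):
        x = x + decision * dx          # update rule from the text
        if abs(x) >= limit:            # bar reached a target arrow
            break
    side = "right" if x > 0 else "left" if x < 0 else "center"
    return side, step

n_updates = int(4.0 / 0.0625)          # 64 position updates in a 4 s task period

# A consistent classifier output reaches the target faster with the larger step.
fast = run_cursor_bar([+1] * n_updates, dx=10)
slow = run_cursor_bar([+1] * n_updates, dx=5)
```

Under these assumptions the larger step halves the number of consistent decisions needed to reach a target, which is one way to read the choice of dx = 10 for good performers and dx = 5 for low performers.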

CDSP
It is necessary to find the optimal features by projecting onto subject-specific spatial patterns that maximize the separability between different mental tasks. In this paper, canonical variate analysis (CVA) was used to extract the CDSPs between two classes based on 345 PSD features. Let the $n_i \times c$ matrix $S_i = (S_{i1}, \dots, S_{i n_i})'$, i = 1, 2, contain the original PSD features of class 1 and class 2, where $n_i$ is the number of samples and c is the number of features.
Let $S = (S_1', S_2')'$. The CDSPs of S are the eigenvectors A of $W^{-1}B$ whose eigenvalues λ are larger than 0, where B represents the between-classes dispersion matrix and W represents the pooled within-classes dispersion matrix,

$$B = \sum_{i=1}^{2} n_i\,(m_i - m)(m_i - m)', \qquad W = \sum_{i=1}^{2} \sum_{j=1}^{n_i} (S_{ij} - m_i)(S_{ij} - m_i)',$$

where $m_i$ and $m$ are the class and total centroids, respectively,

$$m_i = \frac{1}{n_i} \sum_{j=1}^{n_i} S_{ij}, \qquad m = \frac{1}{n_1 + n_2} \sum_{i=1}^{2} n_i\,m_i.$$

The new features are obtained by the projection $Y = SA$. For the second step, we ranked the original features (channel-frequency pairs) according to their contribution to the new space. A discriminant power (DP) measure was defined for each channel; it was computed from $T = (t_1, \dots, t_{c-1})$, which represents the pooled correlation between the original features in S and the new features in Y.
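A compact sketch of the CVA step may help: it builds the between-class and pooled within-class dispersion matrices in their standard form, takes the eigenvectors of $W^{-1}B$ with positive eigenvalues, projects the data, and ranks the original features. The ranking by squared pooled within-class correlation is an assumption used here as a stand-in for the DP measure, whose exact formula is not recoverable from the text; the function name `cdsp` and the synthetic five-feature data are illustrative only.

```python
import numpy as np

def cdsp(S1, S2):
    """Canonical variate analysis for two classes of row-wise feature vectors."""
    S = np.vstack([S1, S2])
    m = S.mean(axis=0)                               # total centroid
    c = S.shape[1]
    B, W = np.zeros((c, c)), np.zeros((c, c))
    for Si in (S1, S2):
        mi = Si.mean(axis=0)                         # class centroid
        B += len(Si) * np.outer(mi - m, mi - m)      # between-class dispersion
        W += (Si - mi).T @ (Si - mi)                 # pooled within-class dispersion
    evals, evecs = np.linalg.eig(np.linalg.solve(W, B))
    evals, evecs = evals.real, evecs.real
    A = evecs[:, evals > 1e-8 * evals.max()]         # keep positive eigenvalues only
    Y = S @ A                                        # projected (new) features
    # Pooled within-class correlation between original and new features.
    n1 = len(S1)
    Sc = np.vstack([S1 - S1.mean(axis=0), S2 - S2.mean(axis=0)])
    Yc = np.vstack([Y[:n1] - Y[:n1].mean(axis=0), Y[n1:] - Y[n1:].mean(axis=0)])
    T = (Sc.T @ Yc) / np.outer(np.linalg.norm(Sc, axis=0), np.linalg.norm(Yc, axis=0))
    return A, Y, T

# Synthetic example: only feature 2 differs between the two classes.
rng = np.random.default_rng(2)
S1 = rng.standard_normal((200, 5))
S2 = rng.standard_normal((200, 5))
S2[:, 2] += 3.0

A, Y, T = cdsp(S1, S2)
ranking = np.argsort(-(T ** 2).sum(axis=1))          # most discriminant feature first
```

With two classes, B has rank one, so a single canonical variate is retained and the ranking is driven by how strongly each original feature correlates with it.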

Class distinctiveness
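The distinctiveness of the two classes can be expressed by the Fisher criterion, where the distance and absolute deviation can be replaced by properties of Riemannian geometry [39]. We used the metric ClassDis to determine the distinctiveness of left-hand and right-hand EEG pattern pairs (class A and class B):

$$\mathrm{ClassDis}(A, B) = \frac{\delta\!\left(\bar{P}_A, \bar{P}_B\right)}{\tfrac{1}{2}\left(\sigma_{P_A} + \sigma_{P_B}\right)},$$

where $\bar{P}_A$ and $\bar{P}_B$ denote the centers of mass of the EEG covariance matrices of category A and category B, respectively. $\sigma_{P_A}$ and $\sigma_{P_B}$ denote the corresponding mean absolute deviations,

$$\sigma_{P_K} = \frac{1}{N_K} \sum_{i=1}^{N_K} \delta\!\left(P_i^{K}, \bar{P}_K\right), \quad K \in \{A, B\},$$

where $N_K$ is the number of covariance matrices in category K and the Riemannian distance δ(·, ·) was computed by equation (4). A higher ClassDis reflects smaller class dispersion and greater distance between classes.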
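The ClassDis metric can be sketched in a few lines, assuming the commonly used form in which the Riemannian distance between the two class centers of mass is divided by the average of the two mean absolute deviations. The helper names and the fixed-point iteration for the center of mass are illustrative assumptions, and the toy matrices are chosen so that the result can be verified by hand.

```python
import numpy as np

def _apply(P, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(P)
    return (V * fun(w)) @ V.T

def distance(P1, P2):
    """Affine-invariant Riemannian distance."""
    R = _apply(P1, lambda w: w ** -0.5)
    return np.sqrt(np.sum(np.log(np.linalg.eigvalsh(R @ P2 @ R)) ** 2))

def center_of_mass(mats, n_iter=100, tol=1e-12):
    """Fixed-point iteration for the Riemannian center of mass."""
    M = np.mean(mats, axis=0)
    for _ in range(n_iter):
        R, R_inv = _apply(M, np.sqrt), _apply(M, lambda w: w ** -0.5)
        T = np.mean([_apply(R_inv @ P @ R_inv, np.log) for P in mats], axis=0)
        M = R @ _apply(T, np.exp) @ R
        if np.linalg.norm(T) < tol:
            break
    return M

def class_dis(covs_a, covs_b):
    """Between-class distance over the mean of the within-class deviations."""
    mean_a, mean_b = center_of_mass(covs_a), center_of_mass(covs_b)
    sigma_a = np.mean([distance(P, mean_a) for P in covs_a])
    sigma_b = np.mean([distance(P, mean_b) for P in covs_b])
    return distance(mean_a, mean_b) / (0.5 * (sigma_a + sigma_b))

# Toy classes made of scaled identity matrices (easy to check by hand).
class_a = np.array([np.exp(-0.1) * np.eye(2), np.exp(0.1) * np.eye(2)])
class_b = np.array([np.exp(1.9) * np.eye(2), np.exp(2.1) * np.eye(2)])
score = class_dis(class_a, class_b)
```

In this toy case the class centers are I and e²·I, at distance 2√2, and each class has a mean absolute deviation of 0.1√2, so the ratio is 20.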

EEG feature discriminancy
We also assessed the discriminancy of a given spatio-spectral EEG feature for the two MI tasks, quantified by using the Fisher score,

$$\mathrm{FS} = \frac{\left|\mu_A - \mu_B\right|}{\sqrt{s_A^{2} + s_B^{2}}},$$

where µ and s² represent the mean and variance of the EEG features of a specific EEG channel and frequency band among trials, respectively, and the subscripts A and B denote the two categories. EEG signals were spatially filtered with a Laplacian derivation. The power spectral density was computed every 62.5 ms using the Welch method with five internal Hanning windows of 500 ms (75% overlap) and was log-transformed. In addition, the Fisher score was quantified in the artifact-free EEG signals.
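A Fisher score map over channels and frequency bands can be sketched as follows, assuming the common form |µA − µB| / sqrt(sA² + sB²) with the sample variance across trials. The array layout (trials × channels × bands), the function name and the synthetic log-PSD values are assumptions made for illustration; the Welch estimation and the Laplacian filtering described above are not reproduced here.

```python
import numpy as np

def fisher_score(feat_a, feat_b):
    """Fisher score per feature; inputs have shape (trials, ...)."""
    mu_a, mu_b = feat_a.mean(axis=0), feat_b.mean(axis=0)
    var_a = feat_a.var(axis=0, ddof=1)           # sample variance across trials
    var_b = feat_b.var(axis=0, ddof=1)
    return np.abs(mu_a - mu_b) / np.sqrt(var_a + var_b)

# Synthetic log-PSD features: 40 trials per class, 15 channels, 23 bands.
rng = np.random.default_rng(3)
log_psd_a = rng.standard_normal((40, 15, 23))
log_psd_b = rng.standard_normal((40, 15, 23))
log_psd_b[:, 6, 4] -= 2.0                        # one channel-band pair differs

fs_map = fisher_score(log_psd_a, log_psd_b)      # shape (15, 23)
best = np.unravel_index(np.argmax(fs_map), fs_map.shape)
```

The resulting map has one value per channel-band pair, so the most discriminant pairs can be read off directly or plotted as a channel-by-frequency image.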

Self-assessment questionnaire
Before starting with each training session, BCI motivation was assessed using a questionnaire. Subjects were asked to fill out the adapted version of the questionnaire for current motivation (QCM) with no time limit. The original QCM consisted of 18 items used to assess four motivational factors (anxiety, probability of success, interest, and challenge) [40]. The adapted version of the QCM measures four motivational factors in BCI training, namely mastery confidence, incompetence fear, interest, and challenge. Subjects rated the extent to which each statement applied to them, using a five-point Likert-type scale [41]. Some items are reverse-scored. The items on each factor were averaged to create factor scores.

Statistical analysis
A one-way repeated measures ANOVA was used to estimate the influence of the three-day training on subjects' online SMR-BCI performance. Performance was measured as the 'actual, online' ClassDis for the TL group, and as the accuracy of the decision results for the CB group. A two-way repeated measures ANOVA with a 2 × 3 mixed design was used for statistical comparison. Group (TL, CB) and Session (1, 2, 3) were the between-subjects and within-subjects independent variables, respectively, and the metrics, including ClassDis, Fisher score, and the behavioral motivation factor scores, were the dependent variables. Mauchly's test was used to check for sphericity, and any violations of sphericity were corrected using Greenhouse-Geisser epsilon values. Bonferroni adjustment was used to define alpha for multiple comparisons among the three sessions. The learning curve was the linear fit of ClassDis across the 30 runs. The training effects are reported as Pearson correlation coefficients, with significance assessed at the 95% confidence level using Student's t distribution.

Online performance
The 'actual, online' performance was evaluated by the accuracy of decision results in the CB group and the ClassDis metric in the TL group. ClassDis reflected what the subjects were seeing from the visual feedback, and thus, the EEG covariance matrix was computed on the broadband (8-30 Hz) for session1 and on the subject-specific bands for session2 and session3. The mean 'actual, online' SMR-BCI performance for the two feedback methods in the three training sessions is shown in figure 5. The ClassDis averages in the TL group (n = 12) were 0.523 ± 0.261, 0.519 ± 0.263 and 0.596 ± 0.371 for the three sessions, respectively. ClassDis decreased slightly in session2 and rebounded to its highest level in session3, an overall increase of 14.0%. Repeated measures ANOVA showed no significant difference among the three sessions [F(1.165, 12.817) = 3.088, p = 0.096, η² = 0.219].
Accuracy in the CB group (n = 12) was calculated as the ratio between the number of correct hits and the total number of trials. The averages for accuracy in the three sessions were 0.532 ± 0.072, 0.545 ± 0.085 and 0.539 ± 0.055. Specifically, the accuracy rose in session2 and then decreased in session3, with an increase of 1.32% after the three-day training. No significant difference was found among the three sessions [F(2, 22) = 0.504, p = 0.611, η² = 0.044].

Class distinctiveness results
In addition to being an 'actual, online' measure of performance in the TL group, the ClassDis metric was used to make the results of the TL and CB feedback groups comparable. The processing used the cleaned raw EEG data, and the frequency band and channel selection were consistent between the two groups. EEG signals were band-pass filtered in the broadband (8-30 Hz) before calculating the covariance matrix. A t test for ClassDis did not reveal any significant difference between the TL and CB feedback groups in the calibration session (TL group: 0.464 ± 0.204, CB group: 0.459 ± 0.228, p = 0.125), indicating that the groups had similar levels of performance before feedback training. We divided the twelve naive subjects of each group into good-performance and low-performance subjects according to the median ClassDis of the calibration session. The subjects were classified before training to avoid the influence of the specific imagery strategy or the choice of feature space or classifier. s2, s4, s8, s9, s10 and s12 were the good-performance subjects in the TL group, and s14, s15, s20-s22 and s24 were the good-performance subjects in the CB group.
The average ClassDis of the 12 subjects in each feedback group is shown in figure 6(a). The red (TL group) and blue (CB group) curves represent the learning curves over 30 runs, and the black lines represent the corresponding linear fits. The progression was estimated from the slope of the regression line of ClassDis computed across runs or sessions. A significant positive Pearson correlation between ClassDis and the run index was found in the TL group, which indicated a significant training effect on class distinctiveness (r = 0.812, p < 0.001, N = 30). However, in the CB group, the slope of the linear fit was negative (r = −0.350, p = 0.058, N = 30), which implied that training was not effective during the three-day training period. The average ClassDis values for the 12 subjects who trained with TL feedback in the first, second and third sessions were 0.523 ± 0.261, 0.584 ± 0.307 and 0.672 ± 0.404, respectively. For the 12 subjects who trained with the CB feedback method, the averages were 0.439 ± 0.177, 0.421 ± 0.102 and 0.418 ± 0.094, respectively. See figure 6(b).
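As an illustration of the trend analysis above, the sketch below fits a least-squares line to a learning curve and computes the Pearson correlation with the run index. The learning curve is synthetic, and the function names `linear_trend` and `t_statistic` are ours; the t statistic shown is the standard one for testing a correlation against zero with N − 2 degrees of freedom.

```python
from math import sqrt


def linear_trend(values):
    """Return (slope, intercept, r) of values against run index 1..len(values)."""
    n = len(values)
    runs = list(range(1, n + 1))
    mean_x = sum(runs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in runs)
    syy = sum((y - mean_y) ** 2 for y in values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(runs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def t_statistic(r, n):
    """t value for testing a Pearson r against zero (n - 2 degrees of freedom)."""
    return r * sqrt((n - 2) / (1.0 - r * r))


# Synthetic, perfectly linear learning curve over 30 runs (illustration only)
curve = [0.50 + 0.01 * run for run in range(1, 31)]
slope, intercept, r = linear_trend(curve)

# r and N as reported for the TL group
t_tl = t_statistic(0.812, 30)
```

A positive slope with a significant r corresponds to the training effect reported for the TL group; the p value itself would be obtained from the t distribution, which is omitted here to keep the sketch free of external dependencies.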
The statistical analysis yielded a significant session × feedback interaction [F(1.385, 30.468) = 5.136, p = 0.021, partial η² = 0.18], with Greenhouse-Geisser correction. There was no significant main effect of session or group. Post hoc comparisons revealed significantly higher performance for the TL feedback group in session3 than in session2, by 0.088 (95% CI: 0.010-0.165, p = 0.024, Bonferroni test), and than in the initial session, by 0.149 (95% CI: 0.024-0.274, p = 0.016, Bonferroni test). This corresponded to a growth rate of 28.5% in the TL group. No significant differences between sessions were found in the CB group. The TL feedback group showed an upward trend, while the CB feedback group decreased slightly. A significant difference was found between the two feedback methods in session3: the ClassDis of the TL group was 60.5% higher than that of the CB group (TL-CB: 0.253, 95% CI: 0.005-0.502, p = 0.046).
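The two percentages above appear to be consistent with dividing the reported mean difference by the corresponding baseline mean. The short check below reproduces them from the reported values; the choice of baseline (TL session1 mean for the growth rate, CB session3 mean for the group comparison) is our reading of the text, not a formula stated in it.

```python
def relative_gain(difference, baseline):
    """Express a mean difference as a percentage of a baseline mean."""
    return 100.0 * difference / baseline


# TL group: session3 minus session1 difference, relative to the session1 mean
tl_growth = relative_gain(0.149, 0.523)

# Session3: TL minus CB difference, relative to the CB session3 mean
tl_vs_cb = relative_gain(0.253, 0.418)
```

Rounded to one decimal place, these evaluate to 28.5 and 60.5, respectively.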
The learning metric ClassDis per subject is shown in figure 7 for the TL feedback group. Subjects who showed a statistically significant increasing trend in the learning metric were considered 'learners'. We found that 41.7% (5 out of 12) of the subjects in the TL group were 'learners' (s1: r = 0.575, p = 0.001; s4: r = 0.532, p = 0.002; s6: r = 0.529, p = 0.003; s8: r = 0.719, p < 0.001; s12: r = 0.395, p = 0.031), among whom three had been identified at calibration as good-performance and two as low-performance subjects. ClassDis for half of the 12 subjects stayed nearly unchanged. s11 showed a significant decreasing trend over the runs and was considered a 'non-learner' (r = −0.448, p = 0.013). This 'non-learner' was a low-performance subject, a case sometimes referred to as BCI illiteracy.
In the CB feedback group (as shown in figure 8), there was only one 'learner' (s15: r = 0.401, p = 0.028). Nine subjects remained at the same level as in the first few runs, and s14 and s19 were 'non-learners' (s14: r = −0.614, p < 0.001; s19: r = −0.439, p = 0.015). The 'learner' was a good-performance subject, whereas the 'non-learners' included both good- and low-performance subjects. The good-performance subjects would be expected to at least maintain their session1 performance level, unless training with CB feedback reduces the SMR characteristics inherent in good-performance subjects.
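The labelling rule used above can be written compactly. The sketch below assumes a significance threshold of 0.05, which is conventional but not stated explicitly in the text; the function name `classify_trend` is ours, and the example values are the r and p values reported for s1, s12 and s11.

```python
def classify_trend(r, p, alpha=0.05):
    """Label a subject from the correlation of ClassDis with the run index.

    alpha = 0.05 is an assumed threshold, not taken from the text.
    """
    if p < alpha and r > 0:
        return "learner"
    if p < alpha and r < 0:
        return "non-learner"
    return "unchanged"


# Reported (r, p) pairs for three TL-group subjects
reported = {"s1": (0.575, 0.001), "s12": (0.395, 0.031), "s11": (-0.448, 0.013)}
labels = {subject: classify_trend(r, p) for subject, (r, p) in reported.items()}
```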

Neurophysiological evidence of subject learning
To explore the neurophysiological evidence of subjects' learning effects, we employed the Fisherscore metric, which measures the discriminancy of EEG features in spatial and spectral distributions for the two MI classes. Figure 9(a) shows the averaged 'broadband' (8-30 Hz) topographic maps of EEG discriminancy over 30 channels among the 12 subjects of each of the TL and CB feedback groups. A relevant SMR feature can be observed in the white dashed circle, which denotes 10 channels in the TL feedback group. Specifically, the averaged EEG discriminancy among C2, C4, CP2 and CP4, all of which were physiologically relevant to right-hemisphere MI topography, showed a continuous improvement, with a 49.8% increase from session1 to session3. Surprisingly, the averaged EEG discriminancy among C1, C3, CP3 and CP1, all of which were physiologically relevant to left-hemisphere MI topography, also increased by 67.9%, from very few distinguishing SMR features to a Fisherscore of 0.268. In the CB feedback group, an SMR feature was also observed in the white dashed box, which denoted activity on 15 channels during the CB task. The averaged EEG discriminancy among the four channels fluctuated, with a 0.5% decrease in the left hemisphere and a 19.1% reduction in the right hemisphere from session1 to session3.
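For readers unfamiliar with the metric, a minimal sketch of a two-class Fisher score for a single feature (for example, the band power of one channel) is given below. It uses a common textbook form, the squared difference of class means divided by the sum of class variances; the exact definition used in this study is given in the methods and may differ, and the input values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(class_a, class_b):
    """Two-class Fisher score of one feature.

    Assumed form: (mean_a - mean_b)^2 / (var_a + var_b), using population
    variances. This is a common definition, not necessarily the study's.
    """
    numerator = (mean(class_a) - mean(class_b)) ** 2
    denominator = pvariance(class_a) + pvariance(class_b)
    return numerator / denominator


# Hypothetical per-trial feature values for left- and right-hand MI
left_hand = [1.0, 2.0, 3.0]
right_hand = [3.0, 4.0, 5.0]
score = fisher_score(left_hand, right_hand)
```

A larger score indicates that the class means are far apart relative to the within-class spread, which is the sense in which a higher Fisherscore reflects stronger SMR discriminancy.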
The boxplot in figure 9(b) portrays the Fisherscore over 10 channels for the TL feedback group and 15 channels for the CB feedback group as training went on. The average Fisherscores of the 12 subjects who received TL feedback training were 0.176 ± 0.212, 0.214 ± 0.216 and 0.264 ± 0.219 in the first, second and third sessions, respectively, with a clear upward trend. By contrast, the average Fisherscore of the CB feedback group remained at about 0.10 for the three sessions. The averages in the first, second and third sessions were 0.126 ± 0.083, 0.107 ± 0.080 and 0.107 ± 0.073, respectively. The statistical analysis yielded a significant session × group interaction [F(2, 44) = 5.851, p = 0.006, η² = 0.210], with sphericity assumed according to Mauchly's test. Post hoc analyses were conducted to interpret this interaction. In the TL feedback group, the Fisherscore in the third session was significantly higher than in the first session by 0.089 (95% CI: 0.020-0.157, p = 0.02), and significantly higher than in the second session by 0.051 (95% CI: 0.000-0.102, p = 0.049). However, no significant difference was found between sessions in the CB feedback group. The TL group showed a higher level of performance than the CB group in all three sessions, with the difference being greatest in session3, when the TL group outperformed the CB group by 0.157 (95% CI: 0.019-0.296, p = 0.028). There was no significant main effect of session or group. Moreover, we estimated the relationship between EEG discriminancy and the ClassDis metric. The overall Fisherscore discriminancy (average of 10 channels for TL, 15 channels for CB) was highly correlated with ClassDis (N = 72, r = 0.939, p < 0.001), suggesting that the increased SMR modulation was crucial for enhanced SMR-BCI performance.
Figure 10(a) shows the 'broadband' EEG discriminancy topographic maps for each subject in the TL group. s1, s2, s4, s6, s8 and s12 showed relevant SMR features in either the left hemisphere (s12), the right hemisphere (s1, s4, s6) or both hemispheres (s2, s8). By contrast, s3, s5 and s11 showed subtle and unstable EEG discriminancy throughout the training process. Slight SMR features could be found in some sessions for s7, s9 and s10. The averaged Fisherscore showed a marked positive growth rate (above 100%) in the left hemisphere of s1 (830%), the right hemisphere of s6 (667%), both hemispheres of s7 (166% and 147%), the left hemisphere of s8 (198%), both hemispheres of s9 (176% and 173%), both hemispheres of s10 (150% and 143%), and both hemispheres of s12 (362% and 704%).
Figure 10(b) shows the EEG discriminancy topographic maps for each subject in the CB feedback group. s13, s14, s16, s17 and s23 showed relevant SMR features in either the left hemisphere (s23), the right hemisphere (s16, s17) or both hemispheres (s13, s14). By contrast, s18, s19, s21, s22 and s24 showed subtle and unstable EEG discriminancy throughout the training process. Slight SMR features could be found in some sessions for s15 and s20. The averaged Fisherscore showed a marked positive growth rate in the left hemisphere of s13 (426%), the left hemisphere of s15 (137%), and the right hemisphere of s18 (142%). However, relevant SMR features that were observed in the initial session in s14, s16, s17, s23 and s24 decreased during training.
Figure 11 depicts the results of four 2 (group) × 3 (session) mixed ANOVAs. The four dependent measures were scores for mastery confidence, interest, challenge, and fear of incompetence. For mastery confidence, the ANOVA showed a significant interaction effect between group and session [F(2, 44) = 4.441, p = 0.018, η² = 0.168]. Post hoc comparisons revealed a significantly lower score for the CB feedback group in session2 than in the initial session, by −1.667 (95% CI: −3.043 to −0.290, p = 0.015, Bonferroni test).
For the interest factor, the ANOVA showed a significant main effect of session, and further post hoc analysis showed a significant difference between session3 and session1 (session3-session1: 0.958, 95% CI: 0.248-1.669, p = 0.010, Bonferroni test). The ANOVA on challenge showed a significant main effect of group, in which scores in the TL group were higher than in the CB group by 0.847 across all sessions [F(1, 22) = 4.118, p = 0.048, η² = 0.162]. No significant interaction effect was found between group and session, nor was there any main effect on fear of incompetence.

Discussion
The proposed approach took advantage of the TL method to align EEG data from different runs in order to provide a self-motivating context for SMR-BCI feedback training. TL was performed at the beginning of new runs, and the Riemannian distances among transformed SPDs were calculated iteratively and translated into the feedback visualization. Moreover, the learning capacities of subjects are better promoted only when machine learning adaptation is enabled at the beginning of new sessions and stopped once the non-stationarity effect has been alleviated [8]. Accordingly, the feedback objects representing prior centers of mass in the proposed training procedure appeared only after non-stationarity effects had been alleviated. A self-motivation mechanism could represent a key cognitive process, beyond motor skill learning, in the self-control of SMR-BCI. TL, as a feedback method, could present the historical and current performance of the subjects in the same space, providing a context in which subjects can achieve self-motivation and thus improve the learning efficiency of SMR-BCI.
Figure 10. The broadband topographic maps of discriminancy across three training sessions per subject on 30 EEG channels over the sensorimotor cortex. Red indicates high separability between left-hand MI and right-hand MI. Discriminancy of each channel is quantified as the Fisherscore of PSD for two MI tasks in the specific frequency band.
The quantitative evidence showed that subjects could improve their SMR-BCI skills using the TL feedback method. Although the online performance of the TL group did not significantly increase after the three-day training, the linear fit of the averaged learning curve for the TL group moved upward, and participants showed significantly higher performance in both the second and third sessions than the initial session in terms of the ClassDis and Fisherscore metrics. In addition, subjects developed better SMR-BCI skills by using TL feedback than by using the typical CB feedback method. In session3, the TL group outperformed the CB group on both the ClassDis and Fisherscore metrics.
We found additional advantages of the proposed TL feedback method as we analyzed the results. Firstly, we noticed that the advanced training protocol had a positive influence in the long term. The average 'actual, online' performance for the TL group decreased in session2, which may be because subjects had difficulty separating the point clusters in the first run of the second session. However, in the last session, subjects adapted to the narrower frequency band and their performance improved. Secondly, we found that the TL feedback method improved SMR-BCI skills not only for the low-performance subjects, but also for good-performance subjects who may be affected by the ceiling effect. Specifically, 'learners' in the TL group included both good- and low-performance subjects. Thirdly, the TL training procedure was very effective in bringing out an emerging SMR discriminancy. The Fisherscore in the TL group substantiated a considerable and significant enhancement of the selected spatio-spectral feature (sensorimotor areas, 8-30 Hz) for session2 and session3, compared to session1. In addition, a relevant SMR discriminancy could be observed in about half of the subjects in the first session for both the TL and CB groups. As the experiment continued, 58% of the subjects in the TL group showed increases higher than 100%, whereas only 25% of the CB group showed improvement and the rest showed decreases in performance to some degree.
Moreover, the results of the adapted QCM questionnaire indicated that the proposed TL feedback protocol moderately boosted subjects' motivation. Subjects in the TL group experienced more challenge and showed stable mastery confidence, as compared to the CB group. Nijboer et al demonstrated that challenge and mastery confidence are the motivational factors most related to SMR-BCI [41]. Consequently, the increased challenge and stable mastery confidence suggested that motivation grows as the feedback training continues. In addition, training with either feedback method increased subjects' interest compared to the initial session, and thus, possible 'frustration' did not affect motivation.
The effects of the TL feedback method were greater than those generated in a study of the online data visualization feedback method [30], which is a non-TL-based feedback method that intuitively reflects the EEG distribution in Riemannian geometry in real time. The major difference between the two methods is whether to affine transform the historical SPDs and then display them on the feedback screen. The positive correlations between the 'broadband' ClassDis and the run index in the current study and [30] were both 0.81. Nevertheless, [30], which also used a three-day training program, found no significant improvement on the last day. Furthermore, 20% of the subjects were non-learners in [30], whereas only 8.33% were non-learners in the TL method. In terms of the 'broadband' Fisherscore topographic maps, 50% of the subjects in TL group showed stable or increased SMR features, which was higher than the 30% rate reported in [30]. Moreover, 50% of subjects in the TL group showed a positive growth rate throughout the training process, whereas only 30% of subjects showed positive increases in [30]. We may conclude that the efficacy of the proposed TL feedback training was moderately attributed to the psychological engagement due to the increased self-motivation.
Figure 12 depicts the projection into two-dimensional space of the SPDs of current runs and the centers of mass of previous runs, before and after TL. For the good-performance subject s2, figure 12(a) shows that in the original space the small and large rectangles were located on opposite sides of the screen, which meant that the distances between the two centers of mass in the 11th run (the first run in session2) and in session1 were both large. In contrast, the four rectangles were located in the same position in the transformed space (figure 12(b)), which made visual comparison convenient. Furthermore, for the low-performance subject s6, because SPDs from different sessions that lay far apart were visualized in a small-scale two-dimensional space, the feedback dots of the current run, which lay close together, were compressed into a line and overlapped with each other (figure 12(a)), making them very difficult to distinguish visually. It was quite apparent that data from different sessions took on a similar shape after TL. Apart from the shape, the shift between the four rectangles had also been removed. It could be concluded that what changes from one session to another can be captured in a 'reference state', whereas the SPDs moved in a consistent direction. Notably, the projection was accomplished by DM, which relaxed the distances between data points not in the same neighborhood while maintaining the local structure. It was clear that TL and the projection method worked together to make cross-session visualization efficient.
In addition to improving SMR learning skills, the value that BCI feedback training adds to children's immersion, creativity and emotional skills has been well documented in studies of children with neurodevelopmental disorders [42]. However, cognitive load and emotional states such as motivation, frustration and distraction greatly affect EEG recordings in children [43]. In follow-up research, we plan to employ the TL-based feedback training method for neurofeedback training of children's affective states [44,45].
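The removal of the between-session shift discussed above can be illustrated with a small numerical sketch. It assumes that the affine transform takes the form C → R^(−1/2) C R^(−1/2) for some reference SPD matrix R and that distances are measured with the affine-invariant Riemannian metric; the choice of R, the matrices, and the function names below are all hypothetical and are not taken from the study's implementation.

```python
import numpy as np


def spd_power(mat, power):
    """Power of a symmetric positive-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * vals ** power) @ vecs.T


def riemann_distance(a, b):
    """Affine-invariant Riemannian distance between SPD matrices a and b."""
    a_inv_sqrt = spd_power(a, -0.5)
    vals = np.linalg.eigvalsh(a_inv_sqrt @ b @ a_inv_sqrt)
    return float(np.sqrt(np.sum(np.log(vals) ** 2)))


def recenter(covs, reference):
    """Apply C -> R^(-1/2) C R^(-1/2) to each matrix (assumed transform)."""
    r_inv_sqrt = spd_power(reference, -0.5)
    return [r_inv_sqrt @ c @ r_inv_sqrt for c in covs]


# Hypothetical 2 x 2 covariance matrices for two MI classes and a reference
left = np.array([[2.0, 0.5], [0.5, 1.0]])
right = np.array([[1.0, 0.2], [0.2, 3.0]])
reference = np.array([[1.5, 0.3], [0.3, 2.0]])

d_before = riemann_distance(left, right)
left_t, right_t = recenter([left, right], reference)
d_after = riemann_distance(left_t, right_t)
ref_t = recenter([reference], reference)[0]
```

Under these assumptions, the reference state is mapped to the identity matrix while the distance between the two class matrices is unchanged. This is one way to see why data from different sessions can be overlaid in a common space without distorting the within-run separability that the subject is asked to improve.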

Conclusion
In this study, a TL feedback protocol was proposed for SMR-BCI training. The method aligns data from different sessions so that feedback from prior runs and the current run can be displayed on the same screen, thus providing additional guidance for subjects' self-motivation. We assessed the effects of TL feedback in a between-subject SMR-BCI training experiment. The increased motivation factors indicated that the proposed feedback protocol moderately boosted subjects' motivation. The class distinctiveness and SMR discriminancy results showed that the TL group's performance in the second and third sessions exceeded its performance in the initial session, which implies that subjects enhanced their SMR-BCI skills using the proposed TL feedback method. By the third training session, a significant difference was observed between the TL group and the CB group, with an overall training effect in the TL group and 41.7% 'learners'. The results showed that the proposed method was superior to the typical CB feedback. We also concluded that the efficacy of the proposed TL feedback method is partially attributable to psychological engagement in the self-motivated context.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.