Predicting nonverbal intelligence level from resting-state connectivity: a neural networks approach

This article describes the development of an artificial neural network model for predicting the level of nonverbal intelligence from resting-state EEG data. Cognitive functioning relies on the synchronization between different brain structures. However, it is still unclear how individual differences in intelligence are related to the global characteristics of information transmission in brain networks. Resting-state functional connectivity studies show that patterns of interaction between brain regions are associated with the level of nonverbal intelligence. In this study, we present the development of a fully connected neural network that predicts the level of nonverbal intelligence from EEG-derived functional connectivity data.


Introduction
According to the neural efficiency hypothesis and the network neuroscience approach [1,2], one's level of intelligence is determined by the extent to which the network of brain interactions is energy-efficient.
Graph theory is used to analyse the topology of brain networks [1]. Applying graph theory to brain data showed that brain networks have a "small-world" organization, characterized by a short average path length and a high degree of clustering among the vertices of the graph. Such organization is hypothesized to optimize the exchange of information between brain regions [3].
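The two quantities that characterize small-world organization can be illustrated with a minimal standard-library sketch. It is not the pipeline used in this study: the function names and the fixed binarization threshold are assumptions made for illustration, and the path length is averaged only over pairs of vertices that are connected.

```python
from collections import deque


def binarize(weights, threshold):
    """Turn a symmetric weight matrix into an undirected adjacency map.

    The threshold is an illustrative assumption; only the upper triangle
    is read, so the matrix is treated as symmetric.
    """
    n = len(weights)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if weights[i][j] > threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def clustering_coefficient(adj):
    """Mean local clustering; vertices with fewer than 2 neighbours count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Number of edges that exist between the neighbours of this vertex.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Average shortest path length over all connected ordered vertex pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source vertex
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")
```

A small-world network combines a clustering coefficient that is high relative to a random graph with a path length that is comparable to one. Computing the small-world index itself would additionally require random reference graphs, which this sketch leaves out.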
The global characteristics of resting-state functional connectivity have been shown to be a stable individual characteristic [4] and are linked to the level of intelligence [5,6].
Since remote brain regions interact via frequency-specific oscillatory communication [2,7], it is possible to assess the nature of their interaction using global brain connectivity characteristics calculated from EEG data.
For example, a study by Langer and co-authors [8] found that nonverbal intelligence was associated with functional connectivity indicators derived from EEG data recorded in a resting state. The authors showed significant correlations between Raven Progressive Matrices scores and the graph metrics of alpha-band brain networks: positive for the Small World Index and the clustering coefficient, and negative for the characteristic path length. In another study [9] the relationship between nonverbal intelligence and EEG-based graph metrics was replicated, but only for the graphs with weak brain connections included. Thus, the patterns of the relationship between nonverbal intelligence and frequency-specific features of resting-state functional connectivity of the brain are still not fully clear.
The electroencephalogram (EEG) signal can be used to assess human mental activity, and research in this area has a long history. Various traditional and new data processing and analysis methods are used to assess performance, mental workload and task engagement based on EEG signals.
In [10], Bashivan et al. propose a new approach to learning representations from multichannel EEG time series and demonstrate its advantages in the context of a mental load classification task. An empirical evaluation on the cognitive load classification problem showed a significant improvement in classification accuracy compared to current approaches in this area.
Friedman et al. conducted a study [11] that aimed to use EEG to determine the cognitive load on subjects during intelligence tests. The study compared several machine learning algorithms. The results showed that XGBoost was more accurate than simple linear regression and Random Forest models. However, with a larger dataset, a personalized ANN could potentially outperform XGBoost.
In [7], the authors reviewed 156 articles, published between January 2010 and July 2018, that applied deep learning to EEG data and covered a variety of applications such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring. The average gain in accuracy for deep learning approaches over traditional baselines was 5.4% across all relevant studies.
Hartmann et al. describe a GAN framework for generating electroencephalographic brain signals [12]. The EEG-GAN framework generated naturalistic EEG signals. This could open up a number of new scenarios for generative applications in the neurobiological and neurological context, such as data augmentation in brain-computer interfacing tasks or the restoration of damaged data segments. The ability to generate signals of a certain class may also open up new possibilities for exploring the basic structure of brain signals.

Methods
A sample of 205 students between the ages of 18 and 25 was recruited on a voluntary basis. Nonverbal intelligence was estimated using the Raven Progressive Matrices [13]. The EEG was recorded using a 64-channel Actichamp system (Brain Products, Germany) in the resting state for 10 minutes (2 minutes with open and closed eyes). After artifacts were removed from the EEG recordings (no more than 15% of each participant's recording was deleted), the functional connectivity of the reconstructed sources of brain activity was calculated. The sources of electrical brain activity were reconstructed using a standard source localization pipeline from the MNE package. The weighted phase lag index (wPLI) [14,15], calculated using the MNE-Python software [16], was used to estimate synchronization between signal pairs. In the next step, an adjacency matrix was constructed from the pairwise synchronization measures between the sources, yielding a functional connectivity matrix for each participant.
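To make the synchronization measure concrete, the following is a minimal NumPy sketch of the wPLI between two signals, following the usual definition: the absolute value of the mean imaginary part of the cross-spectrum, divided by the mean of its absolute value. It is not the MNE implementation used in the study. The function name, the Hann window, and the choice to average the per-frequency values over the band are assumptions made for illustration.

```python
import numpy as np


def wpli(x, y, sfreq, fmin, fmax):
    """Weighted phase lag index between two signals, averaged over a band.

    x, y : arrays of shape (n_epochs, n_times)
    sfreq : sampling frequency in Hz
    fmin, fmax : band limits in Hz (inclusive)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_times = x.shape[1]

    # Windowed Fourier transform of every epoch.
    window = np.hanning(n_times)
    fx = np.fft.rfft(x * window, axis=1)
    fy = np.fft.rfft(y * window, axis=1)
    freqs = np.fft.rfftfreq(n_times, d=1.0 / sfreq)

    band = (freqs >= fmin) & (freqs <= fmax)
    if not band.any():
        raise ValueError("no frequency bins fall inside the requested band")

    # Imaginary part of the cross-spectrum, per epoch and frequency bin.
    imag_csd = np.imag(fx[:, band] * np.conj(fy[:, band]))

    # wPLI per frequency: |mean over epochs| / mean over epochs of |.|
    num = np.abs(imag_csd.mean(axis=0))
    den = np.abs(imag_csd).mean(axis=0)
    per_freq = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return float(per_freq.mean())
```

Applying such a function to every pair of reconstructed sources gives a symmetric matrix with values between 0 and 1, which is the form of connectivity matrix described above. Because only the imaginary part of the cross-spectrum is used, zero-lag coupling does not contribute to the measure.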

Neural network
To solve the problem of predicting the level of nonverbal intelligence on the basis of connectivity matrices, we decided to apply a neural network approach. Choosing a neural network topology is not a trivial task. To search for the hyperparameters of the neural network more effectively, we used the tools of the hyperopt [17] and hyperas [18] libraries.

Preprocessing
Data preprocessing was performed as follows. EEG frequency and eye condition (open / closed) were converted into dummy variables. We used Robust Scaler to reduce the influence of possible outliers in the connectivity matrices. The original dataset was split into training, validation, and test sets [13]. The 64-channel EEG data were recorded from each participant in the eyes-closed and eyes-open conditions and converted into connectivity matrices, and the global characteristics of the graph were then calculated: characteristic path length, clustering coefficient, small-world index, modularity, eigenvector centrality, diameter, and closeness centrality.
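The robust scaling step mentioned above can be sketched in NumPy as follows. Robust scaling centres each feature on its median and divides by its interquartile range, so a few extreme values have little effect on the scale. The function name is illustrative, and fitting the statistics on the training set only is an assumption: the text does not state how the scaler was fitted.

```python
import numpy as np


def robust_scale(train, other):
    """Scale features by the median and interquartile range of the training set.

    train, other : arrays of shape (n_samples, n_features)
    Returns the scaled training array and the scaled second array.
    """
    train = np.asarray(train, dtype=float)
    other = np.asarray(other, dtype=float)

    median = np.median(train, axis=0)
    q1, q3 = np.percentile(train, [25, 75], axis=0)
    iqr = q3 - q1
    # Constant features have zero spread; leave them unscaled.
    iqr[iqr == 0] = 1.0

    return (train - median) / iqr, (other - median) / iqr
```

With the training column 1, 2, 3, 4, 100, the median is 3 and the interquartile range is 2, so the four typical values map to the interval from -1 to 0.5 while the outlier stays clearly separated instead of compressing the rest of the data.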
The total size of the dataset was 2080 records. Each record contains the eye condition (closed / open), the EEG frequency, and the elements of the connectivity matrix. The total number of features is 2285. Fully connected neural networks were used to build regression models.

Topology of neural network
The neural network topology was chosen through a large number of experiments with various network hyperparameters. We used the hyperas and hyperopt libraries to improve the quality and speed of the selection. Because we are solving a regression problem, two settings were kept fixed during topology testing: a single output neuron without an activation function, and the ReLU activation function on all remaining layers. The following parameters were used to train the network: optimizer: rmsprop, loss function: mean squared error (MSE), and metric: mean absolute error (MAE). The number of fully connected layers was increased incrementally, starting from 2 layers, until the prediction quality of the best model within one iteration of parameter selection stopped improving. The number of neurons in each layer was selected from the set [8,16,32,64,128,256,512,750,1024,1200,1500,2048,3000]. After each fully connected layer, a dropout layer was used for regularization, with its rate taken from the continuous range from 0 to 0.5. The number of epochs was chosen from the set [10,25,50,100,150,200]. The batch size was chosen from the set [8,16,32,64,128,256]. The selection of the above parameters was based on the performance of the model on the validation set.
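The search space described above can be written down explicitly. The sketch below uses plain random sampling from the Python standard library as a simple stand-in for the hyperopt and hyperas search used in the study. The function names and the `evaluate` callback, which is expected to train a model for a given configuration and return its validation MAE, are hypothetical.

```python
import random

# Candidate values taken from the description of the search space.
UNITS = [8, 16, 32, 64, 128, 256, 512, 750, 1024, 1200, 1500, 2048, 3000]
EPOCHS = [10, 25, 50, 100, 150, 200]
BATCH_SIZES = [8, 16, 32, 64, 128, 256]


def sample_config(n_layers, rng):
    """Draw one candidate configuration with n_layers hidden layers."""
    return {
        "units": [rng.choice(UNITS) for _ in range(n_layers)],
        "dropout": [rng.uniform(0.0, 0.5) for _ in range(n_layers)],
        "epochs": rng.choice(EPOCHS),
        "batch_size": rng.choice(BATCH_SIZES),
        # Settings that were kept fixed during the search.
        "hidden_activation": "relu",
        "output_units": 1,
        "output_activation": None,
        "optimizer": "rmsprop",
        "loss": "mse",
        "metric": "mae",
    }


def random_search(evaluate, n_layers, n_trials, seed=0):
    """Return the configuration with the lowest score returned by evaluate."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = sample_config(n_layers, rng)
        score = evaluate(cfg)  # hypothetical: validation MAE of a trained model
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

Calling `random_search` repeatedly with `n_layers` set to 2, 3, 4 and so on, and stopping once the best score no longer improves, mirrors the incremental choice of depth described above.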
After numerous experiments, the best performance was demonstrated by the network whose topology is shown in figure 1. The resulting dropout values were 0.44, 0.18, 0.19 and 0.20. Training took 25 epochs with a batch size of 128. The MAE of the network was 0.4043 on the validation set and 0.4048 on the test set.

Results and discussion
The data obtained in this study indicate that the functional connectivity of the brain at rest, estimated on the basis of EEG data, is related to the level of nonverbal intelligence. However, the results show a fairly large prediction error for the target variable. A special feature of the dataset is that one value of the target variable corresponds to several samples from our data: each unique person corresponds to 8 rows in our dataset. This makes it possible to develop a method for more accurate prediction of the level of intelligence for each person. In this research, we simply averaged the model predictions over all rows corresponding to one unique person in the test sample. This gives a small increase in the accuracy of prediction for a person: MAE was 0.4043 on the test set.
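The per-person averaging described above can be sketched in a few lines of standard-library Python. The function names and the toy identifiers in the usage note are illustrative and are not taken from the study's code.

```python
from collections import defaultdict


def mean_absolute_error(y_true, y_pred):
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def average_per_person(person_ids, y_pred):
    """Average the row-level predictions that belong to the same person."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pid, pred in zip(person_ids, y_pred):
        sums[pid] += pred
        counts[pid] += 1
    return {pid: sums[pid] / counts[pid] for pid in sums}


def person_level_mae(person_ids, y_true, y_pred):
    """MAE computed after averaging predictions over each person's rows."""
    averaged = average_per_person(person_ids, y_pred)
    # Every row of a person carries the same target value, so keep one per person.
    targets = dict(zip(person_ids, y_true))
    ids = list(averaged)
    return mean_absolute_error([targets[i] for i in ids],
                               [averaged[i] for i in ids])
```

For a toy person "a" with target 1.0 and row predictions 0.5 and 1.5, the row-level MAE is 0.5, while the averaged prediction is exactly 1.0. Averaging helps when the errors of a person's rows partly cancel, and has no effect when they all point in the same direction.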
Finding a more suitable approach will be one of the priority areas for further research. The low prediction accuracy of the target variable is due to the small amount of collected data relative to such a large number of features. One solution to this issue may be the generation of synthetic data using generative adversarial networks. Further work will focus on developing and training such networks to address the problem of a small training sample.