CAD system for epileptic seizure detection from EEG through image processing and SURF-BOF technique

Epilepsy is one of the most debilitating neurological diseases and can abruptly alter a person’s way of life. Manual diagnosis is laborious, time-consuming, and prone to human error. Therefore, automating this task by developing an intelligent system is necessary. Existing deep learning (DL) models require long training times, large datasets, and machines with more memory and processing power. In addition, owing to the black-box nature of DL models, the features that the network relies on for classification decisions cannot be determined. To overcome these challenges, this study proposes an accurate, automatic, and fast intelligent system for epilepsy detection using a computer-aided diagnosis (CAD) two-dimensional machine learning (ML) framework. Existing ML models struggle to produce reliable and acceptable diagnostic results owing to the low amplitude and nonstationary nature of electroencephalograms (EEGs), particularly in clinical situations where environmental influences are almost impossible to eliminate. The proposed model was built using the Children’s Hospital Boston and the Massachusetts Institute of Technology (CHB-MIT) dataset and represents the first study to employ the speeded-up robust feature (SURF) bag-of-features (BOF) technique for this application, which generates local features from spectrogram images of the respective one-dimensional EEG signal inputs. In addition, DL features were extracted from the spectrogram images for model performance comparison. Both feature types were used separately to train the ML classifiers. Implementing SURF offers fast computation and makes the model invariant to distortion, noise, scale, and rotation. Therefore, the proposed model is more suitable for real-time applications, and this ML framework provides an enhanced accuracy of 99.78% with the support vector machine-RBF classifier, along with 99.56% sensitivity, 100% specificity, and an error rate of 0.22%. The higher detection accuracy demonstrates the effectiveness of the proposed framework for medical disease diagnosis applications.


Introduction
Epilepsy affects all age groups. It is a major risk to life because of possible heart failure, lung and brain issues, and unexpected accidental deaths. Therefore, early diagnosis of epilepsy is necessary [1]. More than 60 million people suffer from epilepsy and experience various forms of seizures. According to estimates, if epilepsy is adequately identified and treated with the aid of anti-epileptic medications, up to 70% of people with this condition may live seizure-free [2].
Frequent seizures are signs of brain epilepsy. Confusion, unusual gazing, and quick, abrupt, and uncontrollable hand motions are also seizure symptoms. Neurological examinations, blood tests, neuropsychological evaluations, and neuroimaging modalities are used to identify epileptic seizures. Specialists have paid particular attention to neuroimaging methods. Electrodes are applied to the patient's scalp to acquire EEG recordings, which capture the electrical impulses generated by the brain. In EEG, voltage fluctuations caused by the ionic currents of brain neurons are measured to determine the bioelectric activity of the brain and gather physiological data from individuals experiencing epileptic seizures [3]. The frequency and rhythm of brain activity are altered by seizures, and these signal recordings are affordable, easy to use, effective, and noninvasive. Several non-EEG-based techniques have also been developed, including fMRI, NIRS, PET, and MEG. Nevertheless, EEG is frequently employed as the base signal for identifying epileptic seizures [4].
Although the origin of the disease remains mostly unknown, patients can be properly treated if seizures are identified early. Several challenges are associated with seizure detection using EEG signals. For example, EEG signals can vary greatly from one person to another (inter-patient variability) and even within the same individual over time (seizure pattern heterogeneity), which makes it difficult to distinguish the abnormal patterns associated with seizures. Moreover, EEG recordings made in real-world scenarios can be contaminated by noise and by artifacts such as muscle activity, eye movement, or electrical interference (artifact contamination) [5]. Diagnosis from EEG signals relies on visual examination of the seizure signals captured during EEG sessions, which requires specialist expertise and experience. However, this process is time-consuming, costly, and prone to mistakes. In some instances, two independent specialists may provide drastically different assessments of the same EEG, leading to incorrect treatment plans [6]. Hence, a fully automatic, precise, real-time model is desirable. Traditional ML models struggle to provide a reliable diagnosis owing to the low amplitude and nonstationary nature of EEG signals and to environmental factors. Moreover, DL models require longer training times, large datasets, and machines with higher memory and processing capabilities. In addition, owing to the black-box nature of DL models, the features that the network relies on for classification decisions cannot be determined. Therefore, a CAD model that can address these challenges and be applied in real time is required. The proposed model was trained with local handcrafted SURF features from spectrogram images of EEG data, as shown in figure 1. The features obtained by implementing SURF are invariant to scale, rotation, distortion, and noise. Moreover, in comparison to state-of-the-art methods, SURF is fast to compute without compromising performance [6]. The robustness of the proposed model was investigated by experimenting with conventional ML classification of EEG with (i) local handcrafted features and (ii) deep features. In addition, direct DL classification was employed by retraining the networks to compare model performance. Among the models developed (with local features, DL features, and direct DL), performance measures were compared based on classification accuracy.
The primary contributions of this study are as follows:
(1) This is a fully automatic, robust, and real-time applicable two-dimensional (2D) ML-based epileptic seizure detection CAD model that can detect one-dimensional (1D) EEG seizures with enhanced accuracy.
(2) The proposed model represents the first study to employ the SURF-BOF technique for epilepsy detection applications, thus making it faster and invariant to distortion, noise, scale, and rotation.
(3) The model overcomes the challenge of irregularity in data shape (signal classification to image classification), allows feature vectors of different sizes, does not require retraining, reduces training time, is appropriate for massive data, and improves the quality of classification.
Section 2 summarizes similar studies in the context of this study. The implemented datasets and techniques are presented in the third section. The planned framework flow is described in the fourth section, and the experimental findings, performance evaluations, and important comparisons are presented in the fifth section. Finally, the conclusions are presented in the last section.

Related works
It is difficult for medical professionals and researchers to detect EEG seizures in real time and to determine whether a patient's symptoms have changed. Previously proposed models were thoroughly reviewed, including their performance, their limitations, and the approaches adopted in the proposed model to overcome these limitations.

Traditional approaches
The authors in [7] used real-time medical information from the Senthil Multispeciality Hospital in India and EEG from the University of Bonn in Germany for epilepsy detection using time-frequency domain features. Using the DWT, the signals were divided into six frequency subbands, from which 12 statistical functions were obtained. The top seven features were determined and then input into the ML classifiers. Six statistical parameters were used to evaluate the performance of the classifications. Different feature and classifier combinations were found to yield different results, and this study was an initial attempt to identify the most effective feature set and classifier. Their model achieved an accuracy of 91.67%. However, a better accuracy is desired in medical systems.
An FPGA-based approach was presented in [8] to distinguish between generalized and focal epileptic seizure types, utilizing a feed-forward multilayer artificial neural network (ANN) architecture. Using the TUH Seizure Detection Corpus (TUH EEG Corpus) dataset, five key features were identified using time-frequency analysis, CWT, and statistical analysis. When tested on an FPGA board, the accuracy of their model was 95.14%. An MLPNN, WT, and STFT-based denoising were used in another study [9] to construct an epilepsy detection framework. Using the Bonn EEG dataset, they obtained an accuracy of 93.9% from CNNs and 97.2% from bidirectional LSTM. However, using RCE networks with DWT features, another study [10] obtained a specificity of 98.77%.
In [11], the authors first used an empirical WT with Fourier-Bessel series expansion to decompose EEG signals. To further condense the feature space, entropy features were employed and sorted according to the p-values from the Kruskal-Wallis statistical test. The Bonn University EEG dataset was used, and ensemble classifiers were employed. Using extra-tree classifiers, they achieved an accuracy of 97.8%. To classify normal and epileptic signals, a novel method using an AANN was proposed in [12]. With an accuracy of 96.45%, the improved OCSA performed well for epilepsy detection. In all these studies, there is still room for improvement in detection accuracy.
FFT, entropy, and approximate entropy features were extracted from EEG signals [13] to train conventional classifiers. With TUH data, FFT and SVM classifiers achieved 96% accuracy. In [14], the authors used ANN classifiers and obtained the same performance when they analyzed the outcomes using the SPSS tools. Because entropy measures provide unique features that are intrinsic and physiologically meaningful, another study [15] used fuzzy and distribution entropy. Fuzzy entropy is applied to EEG signals to quantify the complexity of brain-wave patterns. It is useful for detecting subtle changes in the brain activity associated with seizures. Distribution entropy focuses on characterizing the distribution of values within an EEG signal and provides insights into its statistical properties. From short-term EEG, this model obtained 93% accuracy in classifying normal from epileptic groups. Some epilepsy detection methods may reach local optima rapidly because they fail to completely consider the discriminative features of the EEG signals. Hence, the authors of [16] implemented time-frequency analysis and IHHO with a hierarchical mechanism and achieved an accuracy of 99.06% on CHB-MIT data.

DL-based approaches
The authors in [17] employed the first method to calculate the contribution of the frequency component to the target seizure classification, making it possible to identify specific seizure-related EEG frequency components from baseline EEG measurements. The FT-VGG16 classifier exhibited the highest average accuracy of 99.21%. Additionally, the EEG feature frequencies that most significantly improved the classification accuracy were determined using the SHAP analysis method. In [18], the signal was transformed into a PSDED, and a deep CNN with transfer learning techniques was used to perform automatic feature extraction from the PSDED. Subsequently, the epileptic states were classified into seizure, interictal, preictal duration up to 30 min, and preictal duration up to 10 min, with an average accuracy of 90%.
The authors of [19] proposed a cloud-fog integrated smart neurocare model that uses a DL model for epilepsy detection to perform a temporal analysis of EEG input. Using a patient-independent method, real-time seizure detection in fog layer devices was accomplished with computational efficiency using single-channel EEG inputs. For EEG segments of 30 s duration from the CHB-MIT dataset, an optimal accuracy of 96.43% was achieved. In [20], the authors proposed a DL model that uses iEEG recordings to predict seizures instead of detecting them. The iEEG signals were filtered and segmented. With frequency-domain transformation, these segments were again divided into eight separate spectral bands (delta, theta, alpha, beta, and gamma sub-bands). Subsequently, the mean amplitude and band power features of each band were used to train the CNN and LSTM models. The CNN model achieved an accuracy of 94.74%.
Most of the abovementioned methods use EEG signals without any processing in the DL network. However, in [21], three distinct DL architectures were assessed for a range of preprocessed and merged EEG signals. The authors employed simple, moderately complex, and complex 1D DL architectures and evaluated them with varied inputs such as the original EEG, standardized EEG, the original combined with its square, differentiated EEG, and fast Fourier transforms. They achieved good results (99.13% accuracy) with fewer computational resources. Another study focused on seizure prediction [22]. The authors chose a patient-generic approach to eliminate the problem of variance in seizure characteristics over time, across individuals, and among differences in the duration of different seizure stages. They employed a hybrid feature space created using a number of feature augmentation techniques to address the nonlinearity of epileptic seizures and achieved 94.69% accuracy. There have also been some studies based on autoencoders. For example, one study [23] described an intelligent DCSAE-ESDC model. For the best choice of feature subsets, this approach develops a new feature selection method based on the COA. Additionally, a DCSAE-based classifier was developed. Finally, the Krill-Herd algorithm is used to adjust the parameters of the DCSAE model. They achieved a maximum accuracy of 98.67%. Similarly, another recent study [24] employed a deep convolutional autoencoder and bidirectional long short-term memory for epileptic seizure detection (DCAE-ESD-Bi-LSTM) for the same task and achieved more accurate (99.8%) and optimized results (99.9% precision and 99.6% F1 score). A sparse autoencoder with a swarm-based DL method (SASDL) employing PSO was also proposed [25], which achieved an accuracy of 98.5%.
In addition to increasing classification accuracy, it is important to reduce the computational complexity of CAD systems. This has recently been achieved [26] through an approach that uses built-in deep EEG data analysis for normalization. Avoiding the feature extraction process helps reduce computational complexity. This model achieved 96.99% accuracy with the CHB-MIT data. An entirely different approach, in which a Bi-GRU neural network was used, obtained 98.49% specificity in another study [27]. The input EEG signal was preprocessed using a WT, and the output was postprocessed by moving average filtering, threshold comparison, and seizure merging.
In [28], a random selection and data augmentation technique along with a stacked 1D-CNN model was proposed for seizure onset detection. They achieved a 99.54% accuracy at the segment-based level for the CHB-MIT sEEG dataset. All of these DL-based methods produced more accurate results than traditional approaches. Nonetheless, the underlying criteria for decision-making remain unknown.

Combined approaches
In the aforementioned studies, either handcrafted or DL features were used to develop the model. However, in some studies, both have been combined. For example, a new method for automatically recording epileptic EEG signals has been tested using the Bonn dataset [29]. It was developed using recurrence quantification analysis and approximate entropy and was later combined with a CNN. The findings showed that recurrence quantification and approximate entropy are useful for detecting epileptic seizures, with sensitivity, specificity, and accuracy of 92.17%, 91.75%, and 92.00%, respectively. By combining them with a CNN, the evaluation performance improved to 98.84%, 99.35%, and 99.26%, respectively. Similarly, in [30], EEG signals were preprocessed using a Butterworth filter, and features were extracted from the CNN. By employing mutual-information-based estimators on the features, this method achieved an accuracy of 98% with the CHB-MIT data. In addition to the above studies, some recently reviewed studies, along with their data, methodology, and limitations, are given in table 1.
From the literature review, several classification models that detect epilepsy from EEG signals have been identified. However, these models have some limitations, such as low classification efficiency, long training times, the requirement of large datasets and of machines with more memory and processing power, unknown underlying decision criteria, a higher model parameter count, and overfitting. Additionally, the low amplitude and nonstationary characteristics of EEG make it difficult to produce accurate and acceptable diagnostic results, especially in clinical settings where ambient influences are nearly impossible to completely exclude. These limitations are addressed using a new precise model that is applicable to real-time CAD systems, where the SURF-BOF technique generates invariant local features with faster computation. DL features were also extracted from ResNet, AlexNet, and EfficientNet using transfer learning to illustrate model comparisons.

Materials and methods
The datasets and approaches implemented to create the proposed framework are presented in detail in this section.

Dataset
The CHB-MIT dataset [37], which includes 1D EEG signals of both seizure and healthy patients, was selected for the development of the proposed model. This EEG dataset was created by Children's Hospital Boston and MIT and is available to the general public on the PhysioNet server. Compared to other available datasets, the CHB-MIT dataset is larger, more realistic, and acts as the benchmark dataset for seizure detection tasks [38]. CHB-MIT contains recorded seizures, including clonic, tonic, and atonic seizures, in all regions of the brain. The diversity of patients and the different types of seizures contained in this dataset make it a well-suited resource for developing automatic seizure detection techniques and evaluating their effectiveness in practical contexts [39].
A total of 844 h of scalp EEG recordings with 173 seizures were included in the dataset. There were 23 cases from 22 patients in the dataset (one patient had two recordings, 1.5 years apart): 17 females between the ages of 1.5 and 19, and 5 males between the ages of 3 and 22. Following the discontinuation of anti-seizure medication, the patients were observed for several days to characterize their seizures and determine whether they were candidates for surgery. Each patient had a dedicated folder containing a summary file and EEG recordings of 1-4 h each (EDF format). These folders also included information about the start and end times of seizures in the various EDF files. The dataset contains several artifacts, including eye and muscle movements. All these artifact-prone channels have been identified by Wu et al [40]. On this basis, 15 EEG channels (0-40 Hz), including F7-T7, T7-P7, P7-O1, F3-C3, C3-P3, P3-O1, F4-C4, C4-P4, and P4-O2, were selected for additional investigation and converted into rhythmicity spectrograms using STFT to produce a dataset that was balanced and constant in time and length. To obtain the final dataset, 105 frames from chb01, 30 from chb02, 90 from chb03, and 75 from chb05 were independently obtained from seizure (ictal) and nonseizure files. To expand the dataset, data augmentation, including rescaling and horizontal and vertical flipping, was performed [41]. Rescaling was applied to obtain the same size for all spectrogram images prior to training. Furthermore, flipping with a probability parameter of one generated additional images to obtain sufficient training images.
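The flipping step described above can be illustrated with a minimal sketch. It treats a spectrogram as a nested list of pixel values, which is an assumption made only to keep the example self-contained; the function names are hypothetical and are not taken from the study. With a flip probability of one, each flip always produces exactly one additional image.

```python
def flip_horizontal(image):
    """Mirror each row left to right (time axis reversed)."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows (frequency axis reversed)."""
    return [row[:] for row in image[::-1]]


def augment(image):
    """Return the original image plus its two flipped copies."""
    return [image, flip_horizontal(image), flip_vertical(image)]


# Toy 2 x 3 "spectrogram": rows are frequency bins, columns are time frames.
toy = [[1, 2, 3],
       [4, 5, 6]]
augmented = augment(toy)
```

Each source image therefore contributes three training images in this sketch; how the study combined the flips to reach its final image count is not specified in the text.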

SURF features
Here, the local feature SURF [42] is selected for handcrafted feature extraction during model development because it provides information about each keypoint in the image, is robust against noise and distortion, and is invariant to scale. SURF is appropriate for real-time applications [43,44] because it offers fast computation [45] owing to the use of integral images and box filters.
Feature detection and description are the two key phases of SURF. The detector is based on a Hessian matrix, whereas the descriptor describes the Haar wavelet response distribution around the keypoint. First, a reproducible orientation is fixed using data from the circular region around the keypoint; then, a square region centered on the keypoint is built and oriented in accordance with the previously fixed orientation. This region is split into 4 × 4 subregions. For each subregion, several basic features are calculated at 5 × 5 evenly spaced sample points. Each subregion yields a 4D descriptor vector comprising ∑ dx, ∑ dy, ∑ |dx|, and ∑ |dy|, where dx and dy denote the Haar wavelet responses in the horizontal and vertical directions, respectively. Concatenating the vectors of the 16 subregions gives a 64-dimensional descriptor for each keypoint. The number of keypoints varies depending on the details of each image. The descriptors for each keypoint were used as features. Therefore, before training, the descriptors of each training image were concatenated vertically. In general, let N denote the number of keypoints across all training images. For N keypoints, N × 64 feature descriptors were obtained by implementing SURF. In this manner, these local features are appropriate for distinguishing the spectrogram variations of the seizure and nonseizure categories.
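The assembly of the 64-dimensional descriptor can be sketched as follows. This is a simplified illustration, not the SURF implementation used in the study: the Haar responses are supplied as hypothetical (dx, dy) pairs, and the Gaussian weighting and final normalization applied by full SURF implementations are omitted.

```python
def subregion_vector(responses):
    """Sum the Haar responses of one subregion.

    responses: list of (dx, dy) pairs, one per sample point.
    Returns (sum dx, sum dy, sum |dx|, sum |dy|).
    """
    sum_dx = sum(dx for dx, _ in responses)
    sum_dy = sum(dy for _, dy in responses)
    sum_abs_dx = sum(abs(dx) for dx, _ in responses)
    sum_abs_dy = sum(abs(dy) for _, dy in responses)
    return (sum_dx, sum_dy, sum_abs_dx, sum_abs_dy)


def keypoint_descriptor(subregions):
    """Concatenate the 4D vectors of all subregions into one descriptor."""
    descriptor = []
    for responses in subregions:
        descriptor.extend(subregion_vector(responses))
    return descriptor


# Hypothetical responses: 4 x 4 = 16 subregions, each with 5 x 5 = 25 samples.
samples = [(1.0, -2.0)] * 25
subregions = [samples for _ in range(16)]
descriptor = keypoint_descriptor(subregions)  # 16 subregions x 4 values = 64
```

Keeping both the signed sums and the sums of absolute values lets the descriptor separate regions with a consistent gradient direction from regions whose responses alternate in sign and cancel out.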

Evaluation criteria
In this study, the accuracy metric measures the proportion of correct predictions made by the model among the total predictions (equation (1)). Positive predictive value (PPV) is the probability that a patient for whom the model predicted 'seizure' actually has a seizure (equation (3)), whereas negative predictive value (NPV) is the probability that a patient for whom the model predicted 'nonseizure' truly does not have a seizure (equation (5)). The ability of the model to detect seizures and nonseizures was evaluated using sensitivity (equation (2)) and specificity (equation (4)), respectively.
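For reference, the standard confusion-matrix definitions of these five measures are given below, numbered to match the equation references in the text. The symbols TP, TN, FP, and FN (true positives, true negatives, false positives, and false negatives) are the conventional ones and are assumed here, since the original notation is not available.

```latex
\begin{align}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \tag{1}\\
\text{Sensitivity} &= \frac{TP}{TP + FN} \tag{2}\\
\text{PPV}         &= \frac{TP}{TP + FP} \tag{3}\\
\text{Specificity} &= \frac{TN}{TN + FP} \tag{4}\\
\text{NPV}         &= \frac{TN}{TN + FN} \tag{5}
\end{align}
```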

Proposed methodology
The proposed ML framework, which performs automatic epileptic seizure detection as illustrated in figure 2, will aid in diagnosis and early treatment decisions. This 2D ML framework represents the first study to use the SURF-BOF technique to detect epileptic seizures. To make the model applicable to real-time CAD systems, SURF [46] was implemented. SURF provides faster computation and is invariant to distortion, noise, scale, and rotation, which are key requirements for real-time systems. Because SURF extracts local features, the specific patterns or structures in the spectrogram that may be indicative of seizure activity can be precisely identified. The classifiers were trained using both DL and local handcrafted features, and their performances were compared to develop an improved and more robust ML model. The CHB-MIT dataset [37], which is one of the largest and most diverse publicly available datasets, was selected for the development of the proposed model. These characteristics make the CHB-MIT dataset suitable for seizure detection model development and evaluation [39].

Preprocessing
Because DL networks require fixed data shapes as inputs, irregularities in signal data shapes are a fundamental problem in EEG signal processing. This shape problem can be resolved by using an image-based spectrogram processing method. To create 2D spectrogram images of patients with seizures and healthy participants, EEG signals were first transformed using STFT. The multichannel EEG signals were stored in Excel format, with each column representing one EEG channel and serving as the input for one spectrogram. For patients with the EEG dataset numbers CHB01, CHB02, CHB03, and CHB05, one-sided, non-overlapping, and time-variable rhythmicity spectrograms were created using the STFT method [41]. To focus on the signal properties at a particular point in time, EEG signals were split into shorter segments by windowing. The STFT of the signal was then obtained by taking the DFT of each windowed segment. One of the benefits of the STFT is that its parameters have physical and intuitive interpretations. For example, a larger window size provides more frequency detail but may lose temporal resolution, whereas a smaller window size provides better time resolution but may lose frequency detail. Similarly, a smaller hop size provides a more densely sampled time-frequency representation, whereas a larger hop size results in a sparser representation. The spectrograms were obtained using the 'scipy.signal.spectrogram' function with the following parameters: a sampling frequency of 256 Hz, a window size of 1 s, and a Tukey window with a shape parameter of 0.25 [47]. In contrast to the nonseizure case, the EEG ictal spikes and frequency variations of the seizure recordings in the dataset are sudden and frequent. With low rhythmicity denoted by the yellow band and high rhythmicity shown by the dark blue/purple band, spectrograms can be used to evaluate EEG rhythmicity at various frequencies and to identify such rapid shifts. The STFT provides a visual representation and knowledge of spectral complexity as well as frequency components in the time-frequency domain.
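The windowing-plus-DFT procedure can be sketched in plain Python. This is a didactic re-implementation under stated assumptions, not the 'scipy.signal.spectrogram' call used in the study: it uses a symmetric Tukey taper, a hop size equal to the window size (non-overlapping segments), and unscaled power values, so absolute magnitudes will differ from SciPy's output. The 10 Hz test tone and all function names are hypothetical.

```python
import cmath
import math


def tukey_window(n_samples, alpha=0.25):
    """Symmetric Tukey (tapered cosine) window of length n_samples."""
    edge = alpha * (n_samples - 1) / 2.0  # length of each cosine taper
    window = []
    for n in range(n_samples):
        if n < edge:
            window.append(0.5 * (1 - math.cos(math.pi * n / edge)))
        elif n > (n_samples - 1) - edge:
            window.append(0.5 * (1 - math.cos(math.pi * (n_samples - 1 - n) / edge)))
        else:
            window.append(1.0)
    return window


def stft_power(signal, fs, window_seconds=1.0, alpha=0.25):
    """One-sided power spectrum of consecutive, non-overlapping segments.

    Returns (freqs, frames); frames[k][b] is the power of bin b in segment k.
    """
    n = int(round(fs * window_seconds))
    window = tukey_window(n, alpha)
    n_bins = n // 2 + 1
    freqs = [b * fs / n for b in range(n_bins)]  # bin spacing = fs / n
    frames = []
    for start in range(0, len(signal) - n + 1, n):  # hop size = window size
        segment = [signal[start + i] * window[i] for i in range(n)]
        frame = []
        for b in range(n_bins):
            coeff = sum(segment[i] * cmath.exp(-2j * math.pi * b * i / n)
                        for i in range(n))
            frame.append(abs(coeff) ** 2)
        frames.append(frame)
    return freqs, frames


fs = 256  # sampling frequency in Hz, as stated in the text
# Hypothetical 3 s test tone at 10 Hz standing in for one EEG channel.
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(3 * fs)]
freqs, frames = stft_power(signal, fs)
peak_bins = [frame.index(max(frame)) for frame in frames]
```

With a 1 s window at 256 Hz, each segment holds 256 samples, so the frequency bins are spaced 1 Hz apart and span 0 to 128 Hz. Halving the window would double the number of time frames while widening the bin spacing to 2 Hz, which is the time-frequency trade-off described above.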
The obtained image was 360 × 360 pixels. The spectrogram was cropped to preserve only the pixel data within the spectrogram's top, bottom, left, and right boundaries. This operation removes irrelevant data and empty areas surrounding the spectrogram, including the scale and coordinate axes, thereby increasing the percentage of valid data in the input. Hence, the final model input image was 280 × 274 pixels after cropping. The ictal parts where the seizure began and terminated were referred to as seizure segments. Similarly, a file with no seizure occurrence was referred to as the nonseizure section. A total of 198 seizures were reported at a sampling frequency of 256 Hz in the original dataset. After data augmentation, this study had 300 seizure and 300 nonseizure spectrogram images. Other than the aforementioned modifications, no other processing steps were performed beforehand. The spectrogram images were then used for feature extraction.

Feature extraction
First, DL features were obtained by employing transfer learning. The use of transfer learning avoids the requirement of large amounts of data. Using a pre-trained model can often result in a model that is more accurate and useful and can speed up the training process while using less computational power. DL features were obtained by feeding the preprocessed spectrogram images to the networks and exporting the fully connected layer activations 'fc1000' (ResNet-50), 'fc8' (AlexNet), and 'efficientnet-b0|model|head|dense|MatMul' (EfficientNetB0). An Excel file containing these features was used as the input for the conventional ML classification. Grid search (GS) and ten-fold cross-validation were applied for hyperparameter tuning and to avoid overfitting. Finally, the classification performance measures of the PNN, DT, SVM, and KNN classifiers were evaluated. The best classification algorithm was then identified and used for training with the local handcrafted SURF features.
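The ten-fold cross-validation scheme can be sketched as follows. This is a minimal illustration that assumes stratified folds and no shuffling; the toolbox and exact splitting procedure used in the study are not specified, and the function names are hypothetical. Grid search would wrap this loop, repeating it for every candidate hyperparameter setting and keeping the setting with the best mean validation score.

```python
def stratified_folds(labels, n_folds=10):
    """Assign sample indices to folds so each fold keeps the class balance."""
    folds = [[] for _ in range(n_folds)]
    per_class = {}
    for index, label in enumerate(labels):
        per_class.setdefault(label, []).append(index)
    for indices in per_class.values():
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def train_test_splits(labels, n_folds=10):
    """Yield (train_indices, test_indices), using each fold once for testing."""
    folds = stratified_folds(labels, n_folds)
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [i for k, fold in enumerate(folds) if k != held_out for i in fold]
        yield train, test


# 300 seizure and 300 nonseizure images, as stated for the augmented dataset.
labels = ["seizure"] * 300 + ["nonseizure"] * 300
splits = list(train_test_splits(labels))
```

Every image is used for testing exactly once and for training nine times, so the reported measures reflect performance on data not seen during the corresponding training run.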
To obtain local features from spectrogram images using the SURF approach, the keypoints of the image were first acquired, and the descriptors were then obtained. If an image includes N keypoints, the size of its descriptor array is N × 64. The number of features therefore differs depending on the visual detail of each image. As the images were processed sequentially, the descriptors of each new image were appended to those of the preceding images.

Classification
The excessive dimensionality of the final descriptor array prevents its direct entry into the classifier. To lower the dimensionality and increase processing efficiency, it is necessary to arrange the features and transform them correctly. This was accomplished by using a BOF [48]. After applying SURF, k-means clustering was employed to cluster the feature vectors. The clustering criterion is to minimize the sum of the squared distances between each cluster center and its member points. Once the clustering is completed, a visual word dictionary composed of k vectors is created. Each SURF feature of an image can be represented by a visual word that can be identified in the dictionary. Subsequently, the BOF was used to train the best classification algorithm, which was selected by comparing the performances of the models trained with DL features.
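The BOF encoding can be sketched with a small, self-contained example. Several simplifications are assumptions made for illustration: the toy descriptors are 2-D rather than 64-D, the initial cluster centers are chosen by hand so the run is deterministic, k is set to 2, and the histograms are normalized by the keypoint count. None of these choices is taken from the study.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, point):
    """Index of the closest cluster center, i.e. the visual word."""
    return min(range(len(centers)),
               key=lambda k: squared_distance(centers[k], point))


def kmeans(points, initial_centers, n_iterations=10):
    """Plain Lloyd iterations; returns the k centers (the dictionary)."""
    centers = [list(c) for c in initial_centers]
    for _ in range(n_iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(centers, p)].append(p)
        for k, members in enumerate(groups):
            if members:  # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bof_histogram(descriptors, centers):
    """Fixed-length, normalized visual-word histogram for one image."""
    counts = [0] * len(centers)
    for d in descriptors:
        counts[nearest(centers, d)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Two toy images with different numbers of keypoints.
image_a = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
image_b = [(10.0, 11.0), (11.0, 10.0), (0.0, 2.0), (9.0, 10.0)]
stacked = image_a + image_b  # descriptors of all training images, stacked
dictionary = kmeans(stacked, initial_centers=[stacked[0], stacked[2]])
hist_a = bof_histogram(image_a, dictionary)
hist_b = bof_histogram(image_b, dictionary)
```

Although the two images contribute three and four descriptors, both histograms have length k, which is what allows a fixed-input classifier such as an SVM to be trained on images with varying numbers of keypoints.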
The proposed method permits the use of feature vectors of various sizes. Significant information loss occurs during the informative feature selection and quantization. Each sample has a variable number of features. All features taken from the training dataset must be retained to avoid loss. The fact that feature vectors do not need to be quantized is another benefit of this approach, as it helps improve the quality of classification. Direct DL classification was also employed by retraining the same networks preferred for DL feature extraction to compare model performance.

Experimental results and discussion
The detailed results of the suggested framework experiments are described in this section. The CHB-MIT dataset is preferred for model development, and a montage of the dataset spectrogram images (seizure and nonseizure) is shown in figure 3.
The 1D EEG signals of seizure and nonseizure patients and their corresponding spectrogram images obtained using STFT are presented in table 2. In addition, illustrations of seizure and nonseizure images from the original data and after the augmentation steps (rescaling, horizontal flipping, and vertical flipping) are given in table 3.
Initially, DL features were extracted, and from their classification evaluation measures, the best conventional ML classification algorithm was identified for later experiments. The DL features were extracted by feeding the preprocessed spectrogram images to the networks and exporting the fully connected layer activations 'fc1000' (ResNet-50), 'fc8' (AlexNet), and 'efficientnet-b0|model|head|dense|MatMul' (EfficientNetB0). The scatter plot of the seizure (red) and nonseizure (blue) data points used in transfer learning is shown in figure 4. The activations obtained from the first convolutional layers of the three networks are presented in table 4. The activations are displayed with the help of the 'imtile' function. For ResNet, which has 64 channels in the first convolutional layer, 64 images were displayed using an 8 × 8 grid. However, AlexNet and EfficientNetB0 have only 62 and 60 channels, respectively, in the first convolutional layer. Blank spaces appeared because the same 8 × 8 grid was used to display their activations. An Excel file containing these features was then utilized as the input for conventional ML classification methods. GS and tenfold cross-validation were implemented, and the classification evaluation measures of the PNN, DT, SVM, and KNN classifiers from ResNet-50, AlexNet, and EfficientNet were acquired, as shown in tables 5-7.
Features from ResNet-50 were input into conventional ML classifiers DT, SVM, KNN, and PNN. The evaluation measures for both nonseizure and seizure classes are presented in table 5. The accuracy metric was chosen as the basis for performance comparison between the models. Both SVM and PNN trained with ResNet-50 features performed equally well in the seizure detection task.
Similar to ResNet-50, the features extracted from the fully connected layer of AlexNet were also fed into DT, SVM, KNN, and PNN. The evaluation measures for both nonseizure and seizure classes are presented in table 6. When considering the accuracy metric for performance comparison, the SVM trained with AlexNet features performed better than the PNN in terms of seizure detection.
The SVM classifier provided the best performance with features from EfficientNet, as shown in table 7. These evaluation measures show that the SVM classifier performed well in the diagnosis of patients with seizures. Hence, for the final model, SURF features were fed into the SVM classifier because they offer fast computation and are invariant to distortions, noise, scale, and rotation. The SURF technique was applied with the parameters 'MetricThreshold' (strongest feature threshold) set to 1000, 'NumOctaves' (number of octaves) set to 3, 'NumScaleLevels' (number of scale levels per octave) set to 4, and a rectangular ROI of [1 1 size(I,2) size(I,1)]. SURF feature extraction was performed using the MATLAB (R2021b) function detectSURFFeatures(). The number of keypoints differs from image to image depending on the details present in each spectrogram image (seizure or nonseizure). Hence, the size of the SURF feature descriptor also varies for each image. Because tenfold cross-validation was performed, the images belonging to the training set varied during each fold, thereby increasing the number of features. Hence, it is difficult to calculate the number of features that belong to the seizure and nonseizure classes.
SURF features cannot be directly applied to the ML classifiers. Hence, the BOF technique was applied first. In the BOF technique, k-means clustering is commonly used to quantize local feature descriptors into

Training the SVM classifier ('rbf' kernel) using the BOF produced the best classification evaluation measures, which are given in table 8. The model achieved a classification accuracy of 99.78% for the diagnosis of seizures. The confusion matrix and ROC curve are shown in figures 6 and 7, respectively, and the AUC was 1. The cases were divided into ten folds, and one fold at a time was selected for testing. As shown in figure 6, 225 nonseizure and 225 seizure cases were present during the testing stage. In the nonseizure category, 224 cases were correctly classified as nonseizure, and one case was incorrectly classified as seizure. For the seizure class, all 225 cases were correctly classified.
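As a worked check, the counts in figure 6 can be converted into the evaluation measures. The following is an illustrative Python sketch, not the authors' MATLAB code, and the function name `binary_metrics` is hypothetical. The reported values (99.56% sensitivity, 100% specificity, 100% precision) are reproduced when the nonseizure class is taken as the positive class; this is an inference from the counts, not something stated in the text.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary evaluation measures from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "error_rate": 1 - accuracy,
    }


# Counts from figure 6, ASSUMING nonseizure is the positive class:
# 224 nonseizure cases correct, 1 nonseizure case labelled seizure,
# 225 seizure cases correct, 0 seizure cases labelled nonseizure.
metrics = binary_metrics(tp=224, fn=1, tn=225, fp=0)

for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

If the seizure class were taken as positive instead, sensitivity and specificity would swap (100% and 99.56%), while accuracy and error rate would be unchanged.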
The ROC curve was close to the top-left corner of the ROC space because the model's performance was perfect (AUC = 1).The model correctly classifies positive and negative instances while minimizing false positives.
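An AUC of 1 alongside one misclassified case is not necessarily contradictory, because AUC measures how well the classifier scores rank the two classes over all thresholds, whereas the confusion matrix is taken at a single threshold. The sketch below illustrates this with hypothetical scores (they are not the study's data) and a hypothetical helper `auc_from_scores` that computes AUC as the fraction of correctly ordered positive-negative pairs.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a correctly ordered pair.
    """
    ordered = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                ordered += 1.0
            elif p == n:
                ordered += 0.5
    return ordered / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores (higher means "more likely seizure").
seizure_scores = [0.90, 0.80, 0.70, 0.60]
nonseizure_scores = [0.55, 0.30, 0.20, 0.10]

auc = auc_from_scores(seizure_scores, nonseizure_scores)

# At a fixed decision threshold, one nonseizure case is still misclassified,
# even though every seizure case is ranked above every nonseizure case.
threshold = 0.5
false_positives = sum(score >= threshold for score in nonseizure_scores)

print("AUC:", auc)
print("False positives at threshold 0.5:", false_positives)
```

In this toy case the ranking is perfect, so the AUC is 1, yet the fixed threshold of 0.5 still produces one false positive.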
To perform a comparison with direct DL classification, the spectrogram images were fed into three DL networks (ResNet-50, EfficientNetB0, and AlexNet), and these networks were retrained. The network training plot is shown in figure 8. The network training parameters (number of epochs, initial learning rate, and mini-batch size) were fine-tuned by applying GS, and the optimum values obtained were 20 epochs, a batch size of 8, and a learning rate of 0.001. The final performance metrics are presented in table 9.
The best classification accuracy obtained by DL classification was 97.22% from AlexNet. DL classification by transfer learning requires more training time (6-7 min) than training conventional ML classifiers with DL features (5 min). The proposed model was trained in the shortest time, 3 min. An accuracy of 99.78% was obtained with the SURF-BOF-SVM classifier, which performed better than the DL networks for the classification between seizure and healthy patients. Hence, the proposed model is precise, robust, computationally fast, and applicable to real-time CAD systems. Because spectrogram images are used, the usual struggle of ML models to produce reliable results owing to the low amplitude and nonstationary nature of EEG signals is eliminated. The proposed model is generalizable because an accurate detection model was developed using diverse datasets that can capture distinct patterns in the data, and the training stage was conducted using cross-validation. Even though the model is generalizable, the training time cannot be considered less important. In several real-world applications, models must be periodically retrained to

also allows feature vectors of different sizes, requires no retraining, reduces training time, is appropriate for massive data, and improves classification quality. Because DL is not involved in the finalized model framework, the limitations usually found with such networks do not apply here. The STFT-supported ML classification is compared with other recent studies based on the same dataset (table 10). The proposed model achieves the lowest error rate of 0.22%. It performed better than networks that are usually suitable for time series data, such as LSTM and GRU. Hence, it will aid medical professionals in making precise diagnoses quickly. A remaining issue is that there are currently no publicly accessible large epileptic seizure datasets for extensive validation of the suggested DL/ML-based models for epilepsy detection and classification that are applicable in real time. In future work, an ensemble of DL networks that is invariant to real-time variations and performs early disease diagnosis will be explored, and its performance will be compared with that of the proposed SURF model.

Table 10. Comparison with other recent studies based on the same dataset.
Reference; method; reported performance
Cluster-based phase space density feature + DL (seizure prediction); sensitivity = 94.94%, specificity = 94.94%
[27]; Bi-GRU network; accuracy = 98.49%, sensitivity = 93.89%, specificity = 98.49%
[34]; 1D CNN + Bi-LSTM; accuracy = 0.968, precision = 0.969, sensitivity = 0.968
[36]; 2D deep convolution auto encoder + Bi-LSTM; accuracy = 0.987, sensitivity = 0.987, specificity = 0.988
Proposed work; EEG-STFT-SURF-BOF-SVM; accuracy = 99.78%, sensitivity = 99.56%, specificity = 100%, precision = 100%, recall = 99.56%, F-measure = 99.78%, error rate = 0.22%

Conclusion
This study proposes an automated CAD framework for accurate epilepsy diagnosis that is applicable in real time. The SURF technique used in the framework is invariant to scale, rotation, distortion, noise, etc., and has a short computation time. The BOF technique aids in training with SURF descriptors, as feature reduction cannot be performed. This 2D ML framework uses STFT spectrogram images for 1D EEG seizure detection, overcoming data shape irregularities and DL limitations. This approach eliminates the need for long training times, large datasets, high processing power, and large memory. It also avoids environmental influences and produces reliable diagnostic results in clinical settings. DL features were also extracted using transfer learning to compare the model performance. The final model provided an enhanced accuracy of 99.78% with the SVM-RBF classifier, along with 99.56% sensitivity, 100% specificity, and an error rate of 0.22%. Automated epileptic seizure detection systems can accelerate diagnosis and help patients make early decisions to undergo surgery, possibly improving their quality of life.

Figure 3. Montage of dataset spectrogram images of seizure and nonseizure classes.
visual vocabulary words or 'codebook' centroids. The codebook centroids are then used to represent and categorize the images based on the distribution of local features. First, the cluster centers are initialized randomly. Next, each local feature descriptor extracted from the image is assigned to the closest cluster center based on the Euclidean distance. After all feature descriptors have been assigned, the cluster centers are updated to the mean (centroid) of the feature descriptors assigned to each cluster. These updated cluster centers form the new codebook centroids. The assignment and update steps are repeated iteratively until convergence is achieved [49]. The number of clusters in k-means clustering was
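The k-means steps described above (random initialization, nearest-centroid assignment by Euclidean distance, and centroid update until convergence) can be sketched as follows. This is a minimal standard-library Python illustration, not the MATLAB implementation used in the study. The function names, the codebook size of 8, and the synthetic 64-dimensional descriptors standing in for SURF descriptors are all assumptions made for the example.

```python
import math
import random


def nearest(descriptor, centroids):
    """Index of the centroid closest to the descriptor (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: math.dist(descriptor, centroids[j]))


def build_codebook(descriptors, k, max_iter=100, seed=0):
    """Plain k-means: returns k codebook centroids."""
    rng = random.Random(seed)
    # Random initialization: pick k descriptors as the starting centers.
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(max_iter):
        # Assignment step: each descriptor goes to its closest center.
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(d, centroids)].append(d)
        # Update step: each center becomes the mean of its members
        # (an empty cluster keeps its previous center).
        updated = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:  # converged: centers no longer move
            break
        centroids = updated
    return centroids


def bof_histogram(descriptors, centroids):
    """Normalized histogram of visual-word occurrences for one image."""
    counts = [0] * len(centroids)
    for d in descriptors:
        counts[nearest(d, centroids)] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Synthetic stand-ins for two images with different numbers of keypoints.
data_rng = random.Random(1)
image_a = [[data_rng.random() for _ in range(64)] for _ in range(30)]
image_b = [[data_rng.random() for _ in range(64)] for _ in range(45)]

codebook = build_codebook(image_a + image_b, k=8)
hist_a = bof_histogram(image_a, codebook)
hist_b = bof_histogram(image_b, codebook)
print(len(hist_a), len(hist_b))
```

Although the two synthetic images contribute 30 and 45 descriptors, both are encoded as histograms with one bin per codebook centroid, which is why a classifier such as an SVM can be trained on images whose keypoint counts differ.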

Figure 6. Confusion matrix of the final model.

Figure 7. ROC plot of the final model.

Table 1. Review of some recent epileptic seizure detection work.

Table 2. EEG signal and corresponding spectrogram visualization.

Table 3. Illustration of original and augmented images.

Table 4. DL network activations of the first convolutional layer.

Table 5. Performance metrics from conventional ML classifiers with ResNet-50 features.
adapt to changing data distributions or requirements. Faster training times enable quicker model updates and deployments, thereby ensuring that the model remains accurate. The classification evaluation measures clearly depict the capability of the proposed model and the effectiveness of the techniques implemented at each stage of the framework for detecting epileptic seizures. It

Table 6. Performance metrics from conventional ML classifiers with AlexNet features.

Table 7. Performance metrics from conventional ML classifiers with EfficientNet features.

Table 8. Classification evaluation measures of the SVM classifier for SURF features.

Table 9. Classification evaluation measures of DL networks.