Assessment of QRS and Q-T segments efficacy as non-invasive biomarkers for sudden cardiac death forecasting

Sudden cardiac death (SCD) is a critical event occurring within an hour of sudden cardiac arrest (SCA). SCA often arises from disruptions in cardiac electrical signals, leading to fatality by hindering blood circulation. SCD, a significant contributor to cardiovascular-related deaths, affects millions of people globally. Most studies in the literature focus on heart rate variability (HRV) as a biomarker for predicting SCD while marginalizing other electrocardiogram (ECG) morphological features. This study assesses and compares the efficacy of the QRS and Q-T segments of the ECG signal as non-invasive biomarkers for predicting SCD. The process involves selecting ECG segments from international databases, followed by preprocessing, delineation, empirical mode decomposition (EMD), and median frequency (MDF) feature extraction. Machine learning classifiers, namely support vector machine (SVM) and random forest (RF), are employed to classify SCD and normal sinus rhythm (NSR) classes based on the extracted features. The results underscore the superiority of the Q-T segment, with SVM achieving the best classification performance (accuracy = 83.88%, sensitivity = 90%, specificity = 77.77%). This suggests that the Q-T segment holds the potential to predict SCD better than the QRS segment.


Introduction
Sudden cardiac death (SCD) is an unexpected death that occurs within one hour of the onset of symptoms and is caused by sudden cardiac arrest (SCA) [1,2]. SCA is mainly due to the disruption of electrical conductivity in the heart. Because of this disruption, the heart cannot pump blood to the body's organs, which leads to death [2,3]. SCD is responsible for half of heart disease deaths [4]. It is reported that SCD claims around 350,000 victims in the USA and millions worldwide annually [1,5] without any obvious and diagnosable symptoms [6]. For Malaysia, there is no exact figure for SCD since it is classified under cardiovascular diseases (CVD). In 2020, CVD was one of the top five causes of hospitalization and accounted for 17% of total mortality in Malaysia [7].
Several preliminary research works have attempted to predict SCD using heart rate variability (HRV), achieving a maximum prediction rate of 91.23% when predicting SCD one minute (min) before its occurrence [8]. In 2017, researchers predicted SCD using HRV signals taken five min before its occurrence with a maximum prediction rate of 92% [9]. In 2019, researchers used HRV features as a biomarker to predict SCD and obtained a maximum mean prediction rate of 83.33% (10 min) [10]. In 2023, a study proposed a new approach to predict SCD 30 min before its occurrence using HRV as a biomarker [11]. No research work in the literature has reported an approach to predict SCD more than one hour before its occurrence using the electrocardiogram (ECG) signal only [12]. Hence, further investigation of the ECG signal and its morphological features could improve the SCD prediction time while maximizing sensitivity and specificity. Indeed, there is no proper guideline for assessing high-risk patients if their heart disease history is unavailable. Only a limited number of studies in the literature have focused on developing risk markers for predicting SCD, using either HRV or ECG signals, while marginalizing other ECG morphological features such as QRS and Q-T. Researchers have focused on identifying and categorizing cardiac risk patients based on clinical ratings, but bio-signal-based risk quantification remains unaddressed in the literature [13].
Hence, this study aims to examine the QRS and Q-T segments as simple, non-invasive ECG-based biomarkers for predicting SCD.
In this study, we examine two ECG morphological features, namely the QRS segment and the Q-T segment, to find the one that best predicts SCD one hour before its occurrence. The ECG signals were selected from international databases, and each selected signal is 30 seconds (sec) long. These ECG signals are preprocessed and delineated to determine the QRS and Q-T segments. The QRS and Q-T segments are then decomposed using empirical mode decomposition (EMD). After that, features are extracted using the median frequency (MDF). Finally, two machine learning (ML) classifiers, namely support vector machine (SVM) and random forest (RF), are employed to classify the SCD and normal sinus rhythm (NSR) classes. The evaluation depends on the classification accuracy (Acc), sensitivity (Se) and specificity (Sp).

Methodology
As shown in figure 1, the methodology started with selecting ECG signal segments from two online international ECG databases. The selected ECG signals were 30 sec in length. We then preprocessed the selected ECG signals. After the preprocessing, the delineation process was applied to determine the QRS and Q-T segments. Next, EMD was used to decompose the QRS and Q-T segments separately. Then the MDF was extracted from the decomposed QRS and Q-T segments. Finally, two ML classifiers, namely SVM and RF, were employed to classify the two classes.

ECG Signal Selection
The ECG signals were selected from two different databases. The first is the Massachusetts Institute of Technology (MIT) MIT-NSR database, and the second is the MIT-SCD database. These databases are hosted on Physiobank [14]. A total of 38 ECG signals were selected from both databases and divided into two classes. The MIT-NSR database consists of 18 records from 18 subjects. From this database, 18 ECG signals were selected, one from each record; these signals form the first class of the dataset (NSR class). The MIT-SCD database consists of 23 ECG records, but only 20 could be used because the other three records show no ventricular fibrillation and were excluded. From the MIT-SCD database, 20 ECG signals were selected one hour prior to the SCD onset. The selected signals from both databases were 30 sec in length. It is noteworthy that the ECG records of both databases consist of two ECG leads, but the selected ECG segments were taken from the first lead only.
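The window selection described above can be sketched as simple index arithmetic. This is only an illustration under stated assumptions: the helper name segment_bounds is hypothetical, the window is assumed to start one hour before the onset (the text does not say whether it starts or ends there), and the onset time and sampling rate in the example are made-up values that would normally come from the record annotations and header.

```python
def segment_bounds(onset_sec, fs, lead_sec=3600.0, length_sec=30.0):
    """Return (start, stop) sample indices of a window that begins
    lead_sec seconds before the event onset and lasts length_sec seconds.

    Assumption: the window STARTS one hour before onset; the source
    does not specify this detail.
    """
    start_sec = onset_sec - lead_sec
    if start_sec < 0:
        raise ValueError("record is too short for the requested lead time")
    start = int(round(start_sec * fs))
    stop = start + int(round(length_sec * fs))
    return start, stop


# Hypothetical example: onset 2 h into a record sampled at 250 Hz.
start, stop = segment_bounds(onset_sec=7200.0, fs=250.0)
```

The returned pair can be used directly as a slice, for example lead_1[start:stop], on the first-lead samples of a record.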

Preprocessing
Preprocessing is important before analyzing a bio-signal. The ECG signal is non-stationary and can be affected by many factors such as power line noise (50 Hertz (Hz) / 60 Hz), body muscle noise [15,16], breathing noise [17] and high-frequency noise [18]. Therefore, preprocessing is required to reduce the disparity in the ECG signals and make them uniform by removing the noise and normalizing the signals' amplitude. The preprocessing of the ECG signal in this study involves three stages, namely baseline wander removal, filtering and amplitude normalization. The baseline wander removal was carried out using the stationary wavelet transform (SWT). This approach was proposed and explained by [19] and [20]. After that, a 10th-order band-pass finite impulse response (FIR) filter with cut-off frequencies of 0.5 Hz and 40 Hz was employed to remove other types of noise from the ECG signal [21]. Finally, amplitude normalization was performed to scale the amplitude of all ECG signals to between -1 and +1 [22].
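The filtering and normalization stages can be sketched as follows. This is a minimal illustration, not the study's implementation: the windowed-sinc design with a Hamming window, the division by the peak absolute value, and the function names are all assumptions, the SWT baseline removal stage is not reproduced, and the input is a synthetic stand-in for an ECG segment.

```python
import numpy as np


def bandpass_fir_taps(fs, low=0.5, high=40.0, order=10):
    """Windowed-sinc band-pass FIR taps (order + 1 coefficients).

    Assumption: Hamming-windowed design; the source only gives the
    order and the cut-off frequencies.
    """
    n = np.arange(order + 1) - order / 2.0
    # An ideal band-pass is the difference of two ideal low-pass responses.
    lp_high = 2 * high / fs * np.sinc(2 * high / fs * n)
    lp_low = 2 * low / fs * np.sinc(2 * low / fs * n)
    return (lp_high - lp_low) * np.hamming(order + 1)


def preprocess(ecg, fs):
    """Band-pass filter the signal, then scale it into [-1, 1]."""
    taps = bandpass_fir_taps(fs)
    filtered = np.convolve(np.asarray(ecg, dtype=float), taps, mode="same")
    peak = np.max(np.abs(filtered))
    # Assumption: normalization divides by the peak absolute amplitude.
    return filtered / peak if peak > 0 else filtered


# Synthetic stand-in for a 30 s segment (not real ECG data).
fs = 250.0
t = np.arange(0, 30, 1 / fs)
ecg = np.sin(2 * np.pi * 8 * t) + 0.3 * np.sin(2 * np.pi * 100 * t)
clean = preprocess(ecg, fs)
```

A filter of this low order has a wide transition band, so the 0.5 Hz edge in particular is only loosely enforced; this is one reason a separate baseline wander removal stage is useful.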

ECG Delineation
In this study, the MATLAB toolbox ECGkit [23] was used to delineate the ECG signal and extract the QRS and Q-T segments from it. The first step in ECG delineation is finding the QRS complex; for that purpose, the Pan and Tompkins algorithm was employed [24]. It is an accurate algorithm since it is based on the peak energy to find the QRS complex [25]. Then, the WFDB software built into ECGkit, developed and supported by Physiobank [14], was used to find the onset and offset points of the QRS and Q-T waves. By finding all required peaks, onsets and offsets, the QRS and Q-T segments were determined. Finally, the QRS segments of each ECG signal were gathered in one vector to form one signal, and the same was done for the Q-T segments.
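The last step, gathering the per-beat segments into one vector, can be sketched as below. The function name gather_segments is hypothetical, the onset and offset sample indices are assumed to come from a delineator (they are made up here), and offsets are assumed to be inclusive.

```python
import numpy as np


def gather_segments(ecg, onsets, offsets):
    """Concatenate ecg[onset:offset + 1] for every beat into one vector.

    Assumption: offsets are inclusive sample indices.
    """
    ecg = np.asarray(ecg, dtype=float)
    pieces = [ecg[on:off + 1] for on, off in zip(onsets, offsets) if off >= on]
    return np.concatenate(pieces) if pieces else np.empty(0)


# Toy signal whose sample values equal their indices, with two "beats".
ecg = np.arange(20.0)
qrs_signal = gather_segments(ecg, onsets=[2, 10], offsets=[4, 13])
# qrs_signal holds samples 2..4 followed by samples 10..13
```

The same helper applies to the Q-T segments by passing the Q onset and T offset indices instead.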

Empirical Mode Decomposition
EMD, proposed by [26], is a multiresolution data-adaptive approach used to decompose a signal into meaningful components. EMD is often used to analyze non-stationary and nonlinear signals by partitioning them into multiple components at different resolutions. Common EMD applications include bio-signal analysis, power signal analysis, and seismic signal analysis. EMD is usually used to perform time-frequency analysis while remaining in the time domain. The components produced by EMD are called intrinsic mode functions (IMFs) [27]. In this study, the QRS and Q-T segments were decomposed into their IMFs. Only the first two IMFs were used for feature extraction with the MDF.
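The sifting idea behind EMD can be sketched in a few lines. This is a deliberately simplified illustration and not the decomposition used in the study: the envelopes are linearly interpolated with np.interp instead of the cubic splines of standard EMD, sifting stops after a fixed number of passes rather than by a convergence criterion, and the name emd_sketch is hypothetical.

```python
import numpy as np


def _envelope_mean(x):
    """Mean of the upper and lower envelopes, or None if too few extrema."""
    idx = np.arange(len(x))
    d = np.diff(x)
    # Interior extrema located from sign changes of the first difference.
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    if len(maxima) < 2 or len(minima) < 2:
        return None
    # Simplification: linear envelopes (standard EMD uses cubic splines).
    upper = np.interp(idx, maxima, x[maxima])
    lower = np.interp(idx, minima, x[minima])
    return (upper + lower) / 2.0


def emd_sketch(x, n_imfs=2, n_sift=10):
    """Extract n_imfs components by repeated sifting; return (imfs, residue)."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(n_imfs):
        h = residue.copy()
        for _ in range(n_sift):  # fixed number of sifting passes
            m = _envelope_mean(h)
            if m is None:
                break
            h = h - m
        imfs.append(h)
        residue = residue - h
    return np.array(imfs), residue


# Two-tone test signal: the fast oscillation should dominate the first IMF.
t = np.linspace(0, 1, 500)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
imfs, residue = emd_sketch(x)
```

By construction the extracted components and the residue sum back to the input, which is the defining property of the decomposition; the quality of the individual IMFs depends on the envelope and stopping choices noted above.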

Median Frequency
MDF is the frequency at which the power spectrum of the decomposed QRS or Q-T segments is divided into two regions of equal power [28]. In this study, the MDF was applied after decomposing the QRS or Q-T segments using EMD. Figure 2 illustrates the approaches to extracting the features from the IMFs using MDF. The extracted MDF features were fed to the ML classifiers in three scenarios. The first scenario uses the MDF features from the first IMF. The second scenario uses the MDF features from the second IMF, while the last scenario uses the MDF features from both IMFs gathered in one feature vector.
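The MDF computation and the three feature scenarios can be sketched as follows. This is an illustration under assumptions: the power spectrum is estimated with a plain FFT periodogram, one MDF value is computed per IMF, and the function names and dictionary keys are hypothetical.

```python
import numpy as np


def median_frequency(x, fs):
    """Frequency that splits the one-sided power spectrum into two
    halves of equal total power (FFT periodogram estimate)."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    cumulative = np.cumsum(power)
    # First bin at which the running power reaches half of the total.
    idx = np.searchsorted(cumulative, cumulative[-1] / 2.0)
    return freqs[idx]


def scenario_features(imf1, imf2, fs):
    """Feature vectors for the three scenarios: IMF1, IMF2, and both."""
    f1 = median_frequency(imf1, fs)
    f2 = median_frequency(imf2, fs)
    return {"imf1": [f1], "imf2": [f2], "both": [f1, f2]}


# Pure tones stand in for two IMFs; each MDF should equal the tone frequency.
fs = 100.0
t = np.arange(1000) / fs
tone10 = np.sin(2 * np.pi * 10 * t)
tone25 = np.sin(2 * np.pi * 25 * t)
mdf = median_frequency(tone10, fs)
feats = scenario_features(tone10, tone25, fs)
```

For a single sinusoid all the power sits in one bin, so the MDF coincides with the tone frequency; for a broadband IMF it summarizes where the spectral power is centered.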

Classification
A common task of ML algorithms is to identify objects and separate them into classes. This procedure is called classification, and it helps separate large quantities of data into discrete categories. In biomedicine, ML algorithms are commonly used as computer-aided tools for identifying or predicting pathologies. In this study, after extracting the MDF features from the QRS and Q-T segments, two ML algorithms, SVM and RF, were employed. These classifiers were used separately to classify the NSR and SCD classes.
SVM is a statistical learning algorithm that has shown promising performance compared to traditional ML algorithms in solving pattern recognition and classification problems [29]. It is a supervised learning algorithm that can be used for regression, classification and outlier detection [30]. In this study, the SVM classifier was trained using the radial basis function (RBF) kernel [31] to classify the SCD and NSR classes.
RF is an ensemble learning approach for classification, regression and other tasks that operates by building multiple decision trees during training [32]. The individual decision trees operate as a group: in classification, the prediction of the RF classifier is the class voted for by the majority of trees. In this study, the RF classifier was trained using the Bagging ensemble algorithm [33]. The number of trees was set to 30 since this number produced good results. Fixing this setting also limited the large number of results that would otherwise be produced by trying different numbers of trees and training algorithms.
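The two classifier configurations can be sketched with scikit-learn. This is a stand-in, not the study's code: the library choice, the 5-fold cross-validation (the validation scheme is not stated in the text), and the synthetic two-dimensional features are all assumptions. Only the RBF kernel, the bootstrap aggregation, the 30 trees, and the class sizes (18 NSR, 20 SCD) are taken from the description above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic MDF-like features (NOT real data): 18 NSR and 20 SCD samples,
# two features each, as in the "both IMFs" scenario.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(5.0, 1.0, (18, 2)), rng.normal(15.0, 1.0, (20, 2))])
y = np.array([0] * 18 + [1] * 20)  # 0 = NSR, 1 = SCD

classifiers = {
    # RBF kernel as stated in the text; other hyperparameters are defaults.
    "SVM": SVC(kernel="rbf"),
    # 30 bootstrap-aggregated trees; random_state only fixes the example.
    "RF": RandomForestClassifier(n_estimators=30, bootstrap=True, random_state=0),
}

# Assumption: 5-fold stratified cross-validation as the evaluation scheme.
scores = {
    name: cross_val_score(clf, X, y, cv=5).mean()
    for name, clf in classifiers.items()
}
```

On such well-separated synthetic clusters both classifiers score near 1.0; the figures say nothing about performance on real ECG features.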

Results and Discussion
Two morphological features (the QRS and Q-T segments) were examined in this study to compare them and find the one that best fits the developed approach for predicting SCD one hour before its occurrence. The two morphological features were decomposed separately using EMD. Two levels of IMFs were used to extract features using MDF separately. The extracted MDF features were fed to the ML classifiers in three scenarios: the MDF features from the first IMF, the MDF features from the second IMF, and the MDF features from both IMFs gathered in one feature vector. Finally, the two ML classifiers, SVM and RF, were employed separately to classify the NSR and SCD classes. The results of the QRS segment are listed in table 1. These results are the Acc, Se and Sp of the two ML classifiers for the three scenarios. For the Q-T segment, it can be noticed from table 2 that the best results were achieved by the SVM classifier with the first IMF. The RF classifier achieved its best results with both IMFs. The lowest results for both classifiers were obtained with the second IMF. The SVM classifier performed better than the RF classifier with both morphological features. The best Q-T results (Acc = 83.88%, Se = 90% and Sp = 77.77%) are slightly better than the best QRS results (Acc = 81.11%, Se = 90% and Sp = 72.22%). Therefore, the Q-T segment is better than the QRS segment for predicting SCD one hour before its occurrence. Since the presented approach was designed to be simple, comparing the efficacy of the QRS and Q-T segments for predicting SCD as a preliminary medical screening approach, the achieved results can be considered good.
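The three evaluation measures can be worked through from confusion-matrix counts. The counts below are an assumption: they are inferred from the class sizes (20 SCD, 18 NSR) and the reported best Q-T percentages, and are not stated in the source. SCD is treated as the positive class, and the function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy in percent from raw counts."""
    se = 100.0 * tp / (tp + fn)  # sensitivity: detected SCD / all SCD
    sp = 100.0 * tn / (tn + fp)  # specificity: detected NSR / all NSR
    acc = 100.0 * (tp + tn) / (tp + fn + tn + fp)  # overall accuracy
    return {"se": se, "sp": sp, "acc": acc, "mean_se_sp": (se + sp) / 2.0}


# Inferred counts (assumption): 18 of 20 SCD and 14 of 18 NSR correct.
m = classification_metrics(tp=18, fn=2, tn=14, fp=4)
```

With these counts Se is 90% and Sp is about 77.78%, matching the reported values up to truncation. The mean of Se and Sp is about 83.89%, which agrees with the accuracy quoted in the abstract, whereas the overall fraction of correct decisions would be about 84.21%. The same pattern holds for the QRS figures, where (90 + 72.22) / 2 = 81.11. Whether the reported Acc was actually computed as the mean of Se and Sp is not stated in the text.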

Conclusion
In this study, we conducted a simple approach to examine the QRS and Q-T morphological features and determine which one best predicts SCD one hour before its occurrence. These two morphological features were decomposed using EMD into two levels of IMFs. The MDF was then applied to these IMFs in three scenarios. Two ML classifiers were then employed to classify the NSR and SCD classes. We relied on three classification parameters, Acc, Se and Sp, to identify the best morphological feature. Both morphological features showed good results, but the Q-T segment performed best. The final conclusion is that the Q-T segment is better than the QRS segment as a non-invasive biomarker for predicting SCD one hour before its occurrence. Additionally, the proposed

Figure 1. Block diagram of the proposed methodology.

Figure 2. The three scenarios of extracting MDF features from the IMFs.

Table 1. The results of the QRS segment for the SVM and RF classifiers for the three IMF scenarios.

It is noticeable from table 1 that the best results were achieved by the SVM classifier with both IMFs. Also, the RF classifier achieved its best results with both IMFs. The results of the Q-T segment are listed in table 2. These results are the Acc, Se and Sp of the two ML classifiers for the three scenarios.

Table 2. The results of the Q-T segment for the SVM and RF classifiers for the three IMF scenarios.