QRS complex detection based on continuous density hidden Markov models using univariate observations

In the electrocardiogram (ECG), the detection of QRS complexes is a fundamental step in the ECG signal processing chain since it allows the determination of other characteristic waves of the ECG and provides information about heart rate variability. In this work, an automatic QRS complex detector based on continuous density hidden Markov models (HMM) is proposed. The HMM were trained using univariate observation sequences taken either from QRS complexes or from their derivatives. The detection approach is based on the comparison of the log-likelihood of the observation sequence with a fixed threshold. A sliding window was used to obtain the observation sequence to be evaluated by the model. The threshold was optimized using receiver operating characteristic curves. Sensitivity (Sen), specificity (Spc) and F1 score were used to evaluate the detection performance. The approach was validated using ECG recordings from the MIT-BIH Arrhythmia database. A 6-fold cross-validation shows that the best detection performance was achieved with a 2-state HMM trained with QRS complex sequences (Sen = 0.668, Spc = 0.360 and F1 = 0.309). We concluded that these univariate sequences provide enough information to characterize the QRS complex dynamics with an HMM. Future work is directed toward the use of multivariate observations to increase the detection performance.


Introduction
The electrocardiogram (ECG) signal reflects the electrical activity of the heart and is employed as a noninvasive tool for cardiovascular system analysis and the diagnosis of cardiac pathologies. The ECG has a characteristic morphology formed by the P-wave and the QRS complex, which reflect the atrial and ventricular depolarizations, respectively, and the T-wave, which reflects the ventricular repolarization. In particular, the QRS complex has the largest amplitude of the ECG waves, and its detection is of clinical interest because it makes it possible to estimate statistical indicators used to evaluate the cardiovascular status [1,2].
Taking advantage of the progress in computing power, different algorithms have been conceived for the automatic detection of the QRS complex. The approaches implemented to develop accurate QRS complex detectors range from digital filters to statistical methods such as wavelet transforms [3], artificial neural networks [4,5], genetic algorithms [6], mathematical morphology [7] and hidden Markov models [2,8,9], among others [10][11][12]. For instance, algorithms based on the wavelet transform [3] exploit the possibility of analyzing the ECG signal at different temporal and frequency resolution scales to distinguish the QRS complex which is  [4]. Currently, the performance attained by these methodologies is good, but questions related to the influence of signal noise and abnormal morphologies are still open.
A QRS complex detection algorithm based on continuous hidden Markov models is presented in this paper. The models are trained to characterize the dynamics of the QRS complex and of its derivative. Different numbers of hidden states are evaluated.

Hidden Markov model
An HMM [13] is a statistical model used to characterize signal dynamics as a function of time. A typical HMM is composed of n hidden states and the set of parameters $\lambda = (A, B, \pi)$, where $A = \{a_{ij}\}$ contains the state transition probabilities, $B = \{b_j(O_k)\}$ contains the observation probability distributions of the states, and $\pi$ is the initial state distribution. A training stage is performed to obtain the set of parameters $\lambda$ that maximizes $P(O_{train}|\lambda)$, the probability that an observation sequence taken from the analyzed system is generated by the model. Figure 1 depicts an example of the operation of an HMM, where each observation $O_k$ of a sequence of length T is emitted by a hidden state $S_j$ with probability $b_j(O_k)$. In addition, the model moves from state $S_i$ to state $S_j$ with probability $a_{ij}$.
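To make the quantity $P(O|\lambda)$ concrete, the following minimal sketch computes its logarithm with the scaled forward algorithm. It assumes a single univariate Gaussian emission density per state, which is only one possible form of continuous density; the function names and the example parameters are hypothetical and are not taken from the trained models of this work.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(obs, pi, A, means, variances):
    """Scaled forward algorithm: returns log P(O | lambda).

    pi[j]     : initial probability of state j
    A[i][j]   : transition probability from state i to state j
    means[j], variances[j] : Gaussian emission parameters of state j (assumption)
    """
    n = len(pi)
    # Initialisation: alpha_1(j) = pi_j * b_j(O_1)
    alpha = [pi[j] * gaussian_pdf(obs[0], means[j], variances[j]) for j in range(n)]
    scale = sum(alpha)
    if scale == 0.0:
        return float("-inf")
    alpha = [a / scale for a in alpha]
    log_lik = math.log(scale)
    for o in obs[1:]:
        # Induction: alpha_t(j) = [sum_i alpha_{t-1}(i) * a_ij] * b_j(O_t)
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n))
            * gaussian_pdf(o, means[j], variances[j])
            for j in range(n)
        ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Rescaling keeps alpha from underflowing on long sequences.
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik

# Hypothetical 2-state model; these numbers are illustrative, not trained values.
pi = [1.0, 0.0]
A = [[0.9, 0.1], [0.0, 1.0]]
means = [0.0, 1.0]
variances = [0.05, 0.2]
print(log_likelihood([0.0, 0.1, 0.8, 1.1, 0.9], pi, A, means, variances))
```

The sum of the logarithms of the scaling factors equals $\log P(O|\lambda)$, so the value can be compared directly with a log-likelihood threshold.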

ECG dataset
The approach was validated on ECG signals from the MIT-BIH Arrhythmia database [14]. This database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, sampled at 360 Hz. In addition, annotations of the location of the QRS complexes are provided for each recording.

QRS complex detector based on HMM
The ECG signal was filtered with a 3rd order Butterworth band-pass filter (5 to 15 Hz). To develop automatic QRS complex detectors based on HMM, a training stage was performed that exploits two properties of the QRS complex: the amplitude, $x(t)$, and the rate of change of the amplitude, $x'(t) = \frac{dx}{dt}$. These observations were extracted using an observation window of T = 55 samples (152.8 ms) centered on each provided QRS complex annotation. In each case, 50 QRS complexes were randomly taken from each of 40 ECG recordings. The resulting HMM, $\lambda_x$ and $\lambda_{x'}$, trained from $x(t)$ and $x'(t)$, respectively, were analyzed for $2 \le n \le 4$.
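The extraction of training observations can be sketched as follows. This is a minimal illustration, assuming that the filtered signal is a list of samples and that the annotations are sample indices; the central-difference derivative and all function names are assumptions, since the text does not specify how $x'(t)$ was computed.

```python
import random

def derivative(x, fs):
    """Central-difference estimate of dx/dt (one-sided at the two ends)."""
    n = len(x)
    if n < 2:
        return [0.0] * n
    d = [0.0] * n
    d[0] = (x[1] - x[0]) * fs
    d[-1] = (x[-1] - x[-2]) * fs
    for k in range(1, n - 1):
        d[k] = (x[k + 1] - x[k - 1]) * fs / 2.0
    return d

def qrs_window(x, center, T=55):
    """Return the T samples centred on index `center`, or None near the edges."""
    start = center - T // 2
    stop = start + T
    if start < 0 or stop > len(x):
        return None
    return x[start:stop]

def training_windows(x, annotations, n_per_record=50, T=55, seed=0):
    """Randomly pick up to n_per_record annotated beats and cut one window per beat."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    chosen = rng.sample(list(annotations), min(n_per_record, len(annotations)))
    windows = [qrs_window(x, c, T) for c in chosen]
    return [w for w in windows if w is not None]
```

At 360 Hz, a window of 55 samples spans 55 / 360 s, which is the 152.8 ms quoted above. The same functions apply to the derivative signal by passing `derivative(x, 360)` instead of `x`.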
The detection stage begins with the estimation of $P(O_{test}|\lambda)$, the probability that an observation sequence of T = 55 samples (152.8 ms) is generated by the model. Here, $O_{test}$ is extracted from the signal $x(t)$ or $x'(t)$ with a window that slides along the signal with an overlap of 98%, and $\lambda$ is either $\lambda_x$ or $\lambda_{x'}$. Given the overlap of the sliding window, the test was performed on the first 14 min of the ECG signal. The log-likelihood of the observations, $\log P(O_{test}|\lambda)$, was compared with a fixed threshold $\beta$ to obtain the binary output of the detector. Since $\lambda_x$ and $\lambda_{x'}$ were adjusted to maximize the probability of obtaining a QRS sequence from $x(t)$ and $x'(t)$, respectively, a QRS complex was considered detected if $\log P(O_{test}|\lambda) \ge \beta$. A 6-fold cross-validation was implemented to avoid overfitting. Figure 2 shows the different phases of the detection approach. First, the models are trained in the training phase in order to characterize the dynamics ($x_{QRS}$ or $x'_{QRS}$). Once the models are trained, they are used in the detection phase to compare $\log P(O_1|\lambda)$ with a threshold. In this phase, a sliding window is used to obtain the observation sequence along the ECG signal. Finally, annotations and detections are compared in order to obtain the detection performance.
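The sliding-window decision rule can be sketched as below. The `score` argument is a hypothetical stand-in for a function returning $\log P(O_{test}|\lambda)$ for one window; reporting the window centre as the detection position is also an assumption of this sketch.

```python
def sliding_detect(x, score, beta, T=55, overlap=0.98):
    """Slide a T-sample window over x and flag windows whose score reaches beta.

    score : callable mapping a list of T samples to a log-likelihood
            (placeholder for the trained HMM, not part of the original text)
    Returns the centre sample index of every flagged window.
    """
    # A 98% overlap of a 55-sample window corresponds to a step of about 1 sample.
    step = max(1, round(T * (1.0 - overlap)))
    detections = []
    for start in range(0, len(x) - T + 1, step):
        if score(x[start:start + T]) >= beta:
            detections.append(start + T // 2)
    return detections

# Toy usage: a single spike and a stand-in score (the window maximum).
signal = [0.0] * 200
signal[100] = 1.0
hits = sliding_detect(signal, score=max, beta=0.5)
print(len(hits), hits[0], hits[-1])
```

With a step of one sample, 14 min of signal at 360 Hz (302,400 samples) yields roughly 300,000 windows to evaluate, which illustrates why the overlap drives the duration of the test phase. The toy usage also shows that every window containing the spike is flagged, so a single event produces a run of consecutive detections.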

Performance evaluation
The statistical indicators of each detector were estimated by comparing its detections with the professional annotations, which allowed the true positives (TP), true negatives (TN), false negatives (FN) and false positives (FP) to be quantified. The performance of each HMM detector was then evaluated using the sensitivity, $Sen = \frac{TP}{TP + FN}$, the specificity, $Spc = \frac{TN}{FP + TN}$, and the F1 score, defined here as the harmonic mean of sensitivity and specificity. An optimal threshold, $\beta_{optimal}$, was obtained from the analysis of the receiver operating characteristic (ROC) curve, Sen vs. 1 − Spc. In particular, the distance from a point of the ROC curve to the perfect detection point, $\sqrt{(1 - Sen)^2 + (1 - Spc)^2}$, and the area under the ROC curve (AUC) were quantified.
Table 1 shows the mean detection performance for each model, $\lambda_x$ and $\lambda_{x'}$, with the best results in bold. The AUC was below 0.5, and the lowest AUC was 0.387 for $\lambda_{x'}$. The best optimal thresholds for both detectors are around −7; these thresholds provide F1 = 0.309 for $\lambda_x$ and F1 = 0.293 for $\lambda_{x'}$. These results show that QRS complex detection based on HMM using univariate observations, taken either from the ECG signal or from its derivative, achieves a lower detection performance than the state of the art.
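The threshold selection described above can be sketched as a search for the ROC point closest to perfect detection. This is a minimal illustration under stated assumptions: each window carries a log-likelihood score and a binary label (1 when it matches an annotated QRS complex), and candidate thresholds are supplied as a list; the function names are hypothetical.

```python
import math

def sen_spc(tp, tn, fp, fn):
    """Sensitivity TP/(TP+FN) and specificity TN/(FP+TN); 0.0 if undefined."""
    sen = tp / (tp + fn) if (tp + fn) else 0.0
    spc = tn / (fp + tn) if (fp + tn) else 0.0
    return sen, spc

def roc_distance(sen, spc):
    """Euclidean distance from the ROC point to perfect detection (Sen = Spc = 1)."""
    return math.sqrt((1.0 - sen) ** 2 + (1.0 - spc) ** 2)

def best_threshold(scores, labels, thresholds):
    """Return (beta, distance, sen, spc) for the threshold closest to perfection."""
    best = None
    for beta in thresholds:
        pairs = list(zip(scores, labels))
        tp = sum(1 for s, y in pairs if s >= beta and y)
        fn = sum(1 for s, y in pairs if s < beta and y)
        fp = sum(1 for s, y in pairs if s >= beta and not y)
        tn = sum(1 for s, y in pairs if s < beta and not y)
        sen, spc = sen_spc(tp, tn, fp, fn)
        d = roc_distance(sen, spc)
        if best is None or d < best[1]:
            best = (beta, d, sen, spc)
    return best

# Toy scores: two non-QRS windows and two QRS windows (illustrative values only).
print(best_threshold([-9.0, -8.0, -6.0, -5.0], [0, 0, 1, 1], [-10.0, -7.0, -4.0]))
```

For the best operating point reported in this work (Sen = 0.668, Spc = 0.360), `roc_distance` gives a value of about 0.72.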

Discussion
The highest values of AUC and F1 score were obtained with the ECG signal rather than with its derivative. A possible explanation is that the derivative of the signal may have enhanced the dynamics of the P and T waves and brought them closer to the QRS complex dynamics, leading to an increase in FP. With both signals, the HMM-based detectors show a higher number of FP than FN (sensitivities were greater than specificities). A reason could be that only the QRS complex dynamics were characterized by the model (P and T waves were not taken into account), so that P and T waves could have been confused with QRS complexes, leading to a higher log-likelihood of the observations.
Only approximately half of the total length of each ECG recording was tested, which is a limitation of this paper. This decision was taken given the amount of time involved in the test phase. Future work is directed toward the reduction of the overlap of the sliding window, the characterization of multivariate observations by the HMM, the inclusion of a refractory period to reduce the number of FP, the use of two models to characterize two distinct dynamics, and the use of a relative threshold instead of a fixed one to compare the log-likelihood of the observation sequence given the model.
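As an illustration of the refractory period mentioned as future work, the following sketch keeps a detection only if it lies at least a minimum interval after the previously kept one. It is not part of the evaluated detector; the function name and the 200 ms value are assumptions chosen for the example.

```python
def apply_refractory(detections, fs=360, refractory_s=0.2):
    """Drop detections closer than refractory_s seconds to the last kept one.

    detections   : sample indices flagged by the detector
    fs           : sampling frequency in Hz (360 Hz for the MIT-BIH recordings)
    refractory_s : refractory period in seconds (assumed value, not from the text)
    """
    min_gap = int(round(refractory_s * fs))
    kept = []
    for d in sorted(detections):
        if not kept or d - kept[-1] >= min_gap:
            kept.append(d)
    return kept

# Toy usage: three adjacent windows flag the same beat; only the first is kept.
print(apply_refractory([100, 101, 102, 180, 400]))
```

Because a window sliding with a high overlap flags many consecutive positions around the same event, a rule of this kind would merge such runs into a single detection.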

Conclusion
Using univariate observations taken either from the ECG signal or from its derivative, an HMM was considered to characterize the dynamics of the QRS complex. Our QRS complex detector is based on the comparison of the log-likelihood of an observation sequence given an HMM with a fixed threshold; the best detection performance was Sen = 0.668, Spc = 0.360 and F1 = 0.309, obtained with a 2-state model and the ECG signal.