Automatic ECG-based detection of left ventricular hypertrophy and its predictive value in haemodialysis patients

Objective. Left ventricular hypertrophy (LVH) is one of the most severe risk factors in patients with end-stage kidney disease (ESKD) regarding all-cause and cardiovascular mortality. It contributes to the risk of sudden cardiac death which accounts for approximately 25% of deaths in ESKD patients. Electrocardiography (ECG) is the least expensive way to assess whether a patient has LVH, but manual annotation is cumbersome. Thus, an automated approach has been developed to derive ECG-based LVH parameters. The aim of the current study is to compare automatic to manual measurements and to investigate their predictive value for cardiovascular and all-cause mortality. Approach. From the 12-lead 24 h ECG measurements of 301 ESKD patients undergoing haemodialysis, three different LVH parameters were calculated. Peguero-Lo Presti voltage, Cornell voltage, and Sokolow–Lyon voltage were automatically derived and compared to the manual annotations. To determine the agreement between manual and automatic measurements and their predictive value, Bland–Altman plots were created and Cox regression analysis for cardiovascular and all-cause mortality was performed. Main results. The median values for the automatic assessment were: Peguero-Lo Presti voltage 1.76 mV (IQR 1.29–2.55), Cornell voltage 1.14 mV (IQR 0.721–1.66), and Sokolow–Lyon voltage 1.66 mV (IQR 1.08–2.23). The mean differences when compared to the manual measurements were −0.027 mV (0.21 SD), 0.027 mV (0.13 SD) and −0.025 mV (0.24 SD) for Peguero-Lo Presti, Cornell, and Sokolow–Lyon voltage, respectively. The categorial LVH detection based on pre-defined thresholds differed in only 13 cases for all indices between manual and automatic assessment. Proportional hazard ratios only differed slightly in categorial LVH detection between manually and automatically determined LVH parameters; no differences could be found for continuous parameters. Significance. This study provides evidence that automatic algorithms can be as reliable in LVH parameter assessment and risk prediction as manual measurements in ESKD patients undergoing haemodialysis.


Introduction
Left ventricular hypertrophy (LVH) is an enlargement of the left ventricular heart muscle. It is very common in patients with end-stage kidney disease (ESKD) and is one of the strongest risk factors for all-cause and cardiovascular mortality (Testa et al 2010). LVH is strongly associated with sudden cardiac death, which accounts for approximately 25% of deaths in ESKD (Waks et al 2016). Echocardiography and magnetic resonance imaging are more sensitive and specific than electrocardiography (ECG) for assessing whether a patient has LVH. However, ECG is more convenient, less expensive, and readily available (Cuspidi et al 2012, Bacharova et al 2014, Afkhami and Tinati 2015). Recently, the Peguero-Lo Presti voltage has been described to have a higher sensitivity than the measures proposed by Cornell or Sokolow-Lyon for the ECG-based detection of LVH (Hancock et al 2009, Peguero et al 2017). The predictive value of manually derived ECG-LVH parameters in haemodialysis patients was recently reported (Braunisch et al 2022).
To facilitate fast, reliable, and outcome-associated ECG analysis, automation is key. Especially in long-term ECG recordings, algorithm-based ECG analysis is convenient, since manual annotation and determination of LVH parameters is difficult and time consuming (Adam et al 2018). An algorithm for the annotation of 12-lead ECGs to calculate RR-interval series was previously developed and validated (Hagmair et al 2017) and was used in this study. However, a comparison of automatically and manually measured ECG amplitudes for the calculation of LVH parameters for risk prediction is currently not available. Furthermore, an automatic approach could easily be applied to the whole 24 h recording and could therefore be used to analyse fluctuations in amplitudes, which can occur especially in haemodialysis patients, as observed previously (Braunisch et al 2022).
Thus, the primary objective of this study was to compare automatically and manually derived LVH parameters. In addition, the predictive value of the automatically derived LVH parameters for all-cause and cardiovascular mortality in haemodialysis patients was examined, again in comparison with the manually derived parameters.

Study population
The data used in this work are from the ISAR ('rISk strAtification in end-stage Renal disease') study (ClinicalTrials.gov; identifier number: NCT01152892), which was conducted according to the Declaration of Helsinki and has been approved by the Medical Ethics Committee of the Klinikum rechts der Isar of the Technical University of Munich and of the Bavarian State Board of Physicians (Schmaderer et al 2016). In brief, inclusion criteria were age ⩾18 years, dialysis vintage ⩾90 d, and written informed consent (Schmaderer et al 2016). Patients were excluded in case of pregnancy, ongoing infection, or malignancy with a life expectancy <24 months (Schmaderer et al 2016). As previously described, 82 subjects had to be excluded due to low ECG quality (e.g. (motion) artefacts or (intermittent) signal losses in one or more leads needed for the calculation of the LVH parameters, as determined by visual inspection), ventricular paced rhythm, or complete left or right bundle branch block. The recordings of 308 participants from the original study by Braunisch et al (2022) were used. Seven additional subjects had to be excluded for this study because the raw ECG data needed for the automatic measurements were not available, resulting in 301 included subjects (figure 1).

Electrocardiography
The data were recorded via a 12-lead 24 h ECG using the Lifecard CF digital Holter recorder (Delmar Reynolds/Spacelabs Healthcare, Nuremberg, Germany). Recording started 5-25 min before a dialysis session and lasted for approximately 24 h. In Braunisch et al (2022), two distinct time points (i.e. pre-dialytic and post-dialytic) were used, but risk prediction was mainly based on post-dialytic measurements. Thus, for this analysis, the data of the post-dialytic measurement were used. As described previously (Braunisch et al 2022), the pre-dialytic measurement of a patient was used only if the post-dialytic measurement was unusable because of artefacts or signal loss in one or more leads. The amplitudes needed for comparison were measured manually by medical professionals.
For the automatic data evaluation, the AIT ECGSolver algorithm (Bachler et al 2013) was used. It can automatically detect the QRS complex, P-wave and T-wave in each of the 12 leads and classify the QRS complex as normal or ectopic. The single-lead annotations are combined for the 12-lead ECG (Hagmair et al 2017). To calculate the R and S amplitudes, an isoelectric baseline is computed by creating an interpolation line from the QRS onset of the beat in question to the QRS onset of the following beat (Bachler et al 2013). The amplitudes for the LVH parameter determination were calculated by using the amplitude of the detected R peak, the local minimum of the S amplitude, and subtracting the baseline (figure 2).
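The baseline correction described above can be illustrated with a minimal Python sketch. It is not the AIT ECGSolver implementation: the function names, the index-based interface, the search window for the S-wave, and the convention of returning the S amplitude as a positive depth below the baseline are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def baseline_at(signal: Sequence[float], onset: int, next_onset: int, idx: int) -> float:
    """Value at sample idx of the straight line joining the signal at the QRS
    onset of the current beat and at the QRS onset of the following beat."""
    if next_onset <= onset:
        raise ValueError("next_onset must lie after onset")
    slope = (signal[next_onset] - signal[onset]) / (next_onset - onset)
    return signal[onset] + slope * (idx - onset)


def rs_amplitudes(
    signal: Sequence[float],
    onset: int,
    next_onset: int,
    r_idx: int,
    s_window: Tuple[int, int],
) -> Tuple[float, float]:
    """Return baseline-corrected (R, S) amplitudes in the units of signal.

    r_idx is the sample of the already detected R peak. s_window is a
    hypothetical (start, stop) sample range in which the S-wave is taken
    as the local minimum. The S amplitude is returned as a positive depth
    below the baseline (an assumption of this sketch).
    """
    start, stop = s_window
    if not (0 <= start < stop <= len(signal)):
        raise ValueError("invalid S search window")
    r_amp = signal[r_idx] - baseline_at(signal, onset, next_onset, r_idx)
    s_idx = min(range(start, stop), key=lambda i: signal[i])
    s_amp = baseline_at(signal, onset, next_onset, s_idx) - signal[s_idx]
    return r_amp, s_amp


# Toy beat on a linearly drifting baseline (0.1 mV per sample).
beat = [0.1 * i for i in range(11)]
beat[2] += 1.0   # R peak, 1.0 mV above the baseline
beat[3] -= 0.5   # S-wave, 0.5 mV below the baseline
r_mv, s_mv = rs_amplitudes(beat, onset=0, next_onset=10, r_idx=2, s_window=(3, 6))
```

In the toy example the drifting baseline is removed, so the recovered amplitudes are approximately 1.0 mV for R and 0.5 mV for S, independent of the drift.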
To calculate the Peguero-Lo Presti voltage, S_D, which is the amplitude of the deepest S-wave among all 12 leads, was determined first. The criterion is calculated by adding S_D to the amplitude of the S-wave in the ECG Wilson chest lead V4 (SV4), i.e. S_D + SV4 (Peguero et al 2017). The Cornell voltage was calculated by adding the amplitude of the R-peak of the augmented ECG limb lead aVL (RaVL) to the amplitude of the S-wave in chest lead V3 (SV3), i.e. RaVL + SV3 (Casale et al 1987). For the Sokolow-Lyon voltage, the amplitudes of the R-peak in chest leads V5 and V6 (RV5 and RV6, respectively) were compared and the larger of the two was added to the amplitude of the S-wave in chest lead V1 (SV1), i.e. SV1 + max(RV5, RV6) (Sokolow and Lyon 1949). For each parameter, there is a specific threshold that, if exceeded, indicates the presence of LVH. The threshold for the Peguero-Lo Presti voltage is 2.3 mV in women and 2.8 mV in men (Peguero et al 2017). For the Cornell voltage, it is 2.0 mV in women and 2.8 mV in men, and for the Sokolow-Lyon voltage it is 3.5 mV (Sokolow and Lyon 1949, Casale et al 1987).
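The three voltage criteria and their thresholds can be written down compactly. The following Python sketch assumes that R and S amplitudes are available per lead as positive values in mV; the dictionary layout, the function names, and the use of a strict comparison for "exceeded" are illustrative assumptions and not part of the study's software.

```python
from typing import Dict, Mapping

# Thresholds in mV as stated in the text; the Sokolow-Lyon threshold is not sex-specific.
THRESHOLDS_MV: Dict[str, Dict[str, float]] = {
    "peguero_lo_presti": {"female": 2.3, "male": 2.8},
    "cornell": {"female": 2.0, "male": 2.8},
    "sokolow_lyon": {"female": 3.5, "male": 3.5},
}


def peguero_lo_presti(s: Mapping[str, float]) -> float:
    """S_D + SV4, with S_D the deepest S-wave among all available leads."""
    return max(s.values()) + s["V4"]


def cornell(r: Mapping[str, float], s: Mapping[str, float]) -> float:
    """RaVL + SV3."""
    return r["aVL"] + s["V3"]


def sokolow_lyon(r: Mapping[str, float], s: Mapping[str, float]) -> float:
    """SV1 + max(RV5, RV6)."""
    return s["V1"] + max(r["V5"], r["V6"])


def lvh_flags(r: Mapping[str, float], s: Mapping[str, float], sex: str) -> Dict[str, bool]:
    """Categorical LVH detection per criterion; sex is 'female' or 'male'.
    A strict '>' is used for 'exceeded'; boundary handling is an assumption."""
    values = {
        "peguero_lo_presti": peguero_lo_presti(s),
        "cornell": cornell(r, s),
        "sokolow_lyon": sokolow_lyon(r, s),
    }
    return {name: value > THRESHOLDS_MV[name][sex] for name, value in values.items()}


# Hypothetical amplitudes (mV) for one heartbeat.
leads = ["I", "II", "III", "aVR", "aVL", "aVF", "V1", "V2", "V3", "V4", "V5", "V6"]
s_amp = {lead: 0.5 for lead in leads}
s_amp.update({"V1": 1.2, "V3": 1.5, "V4": 1.0})
r_amp = {lead: 0.5 for lead in leads}
r_amp.update({"aVL": 0.8, "V5": 1.5, "V6": 2.0})

flags_female = lvh_flags(r_amp, s_amp, "female")
flags_male = lvh_flags(r_amp, s_amp, "male")
```

With these made-up amplitudes the voltages are 2.5 mV (Peguero-Lo Presti), 2.3 mV (Cornell), and 3.2 mV (Sokolow-Lyon), so the same heartbeat is classified as LVH by the first two criteria for a woman but by none of them for a man.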

Data collection and laboratory measurements
Clinical data, as well as baseline demographics, were obtained from dialysis protocols and medical records (Braunisch et al 2022). Blood chemistry was measured before the dialysis session. The Charlson Comorbidity Index (CCI; range 0 to 21) and the Cardiovascular Mortality Risk Score (range −11 to 39) were calculated to assess comorbidities and cardiovascular mortality risk, respectively (Charlson et al 1987, Anker et al 2016). Both scores assign numerical weights to conditions or domains, for instance heart failure, cancer, and diabetes (CCI), and age, body mass index, presence of a history of cardiovascular disease, and haemoglobin (Cardiovascular Mortality Risk Score).

Endpoints
The primary endpoint was cardiovascular mortality, while all-cause mortality served as secondary endpoint. These endpoints were classified by the ISAR Endpoint Committee by assessing medical records and databases of each dialysis centre, as well as by contacting the attending physician or the next of kin (Schmaderer et al 2016).

Statistical analysis
Data from continuous variables are presented as mean and standard deviation (SD) or median and interquartile range (IQR), as appropriate, and categorical data as total number, frequencies, and percentages. The McNemar test for paired samples was used to detect differences in the categorical values between manual and automatic measurement. The agreement of the continuous measurements from the same heartbeat was visualized using the method of Bland-Altman and linear regression plots. Linear association was assessed using the Pearson correlation coefficient. For the endpoints, Cox proportional hazards regression analysis was used in univariate form, in a model accounting for the Charlson Comorbidity Index and the Cardiovascular Mortality Risk Score, and in a model with sex as an additional covariate. Hazard ratios (HR) including their 95% confidence intervals as well as the log-likelihood (logL) value, as a measure of the quality of the model fit, are presented. The log-likelihood values allow for a comparison of fitted models. Statistical significance was assumed at a 5% level. Statistical analysis was performed using Matlab R2019b (The MathWorks, Inc, Natick, Massachusetts).
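The two agreement analyses can be sketched in a few lines of Python (the study itself used Matlab). The sketch assumes that differences are taken as automatic minus manual, that limits of agreement are bias ± 1.96 SD, and that the exact (binomial) form of the McNemar test is used; the paper does not state which McNemar variant or sign convention was applied, so these are illustrative choices.

```python
import math
import statistics
from typing import Sequence, Tuple


def bland_altman(manual: Sequence[float], automatic: Sequence[float]) -> Tuple[float, float, Tuple[float, float]]:
    """Return (bias, SD of differences, (lower, upper) limits of agreement).
    Differences are automatic - manual (sign convention assumed here)."""
    if len(manual) != len(automatic) or len(manual) < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    diffs = [a - m for m, a in zip(manual, automatic)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: pairs classified LVH manually but not automatically.
    c: pairs classified LVH automatically but not manually.
    Under the null hypothesis the discordant pairs follow Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical paired voltages (mV) and discordant counts, for illustration only.
bias, sd, (lower, upper) = bland_altman([1.0, 2.0, 3.0], [1.1, 1.9, 3.3])
p_value = mcnemar_exact(1, 5)
```

Only the discordant pairs enter the McNemar test, which is why a small number of changed classifications, split roughly evenly between both directions, yields a non-significant result.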

Baseline characteristics of study participants
The study population included 301 patients with a mean age of 64.1 years (15.5 SD). There were 199 male and 102 female participants with a median dialysis vintage of 44.6 months (IQR 23.4-75.6); for details, see table 1.

Association with mortality
Of the 72 patients who died within the median follow-up time of 36 months (IQR 25.3-36.0), 24 died of cardiovascular causes. In addition, 24 patients received a kidney transplant and 8 patients were lost to follow-up. Both groups were censored at the last day of dialysis.
The Cox models (tables 2 and 3) show that the HRs for all-cause mortality were similar for continuous and categorical values when comparing automatic to manual assessment, i.e. none were significant and all showed the same tendencies in univariate and multivariable analysis. For example, the categorical Peguero-Lo Presti voltage showed virtually no difference in HR for all-cause mortality in univariate analysis, with 1.18 (0.69, 2.04), p = 0.54 for the manual measurement and 1.18 (0.69, 2.02), p = 0.54 for the automatic measurement. Similarly, the HR for the continuous values changed from 1.12 (0.91, 1.38), p = 0.30 for the manual measurement to 1.09 (0.90, 1.33), p = 0.36 for the automatic one. For cardiovascular mortality, again no clinically relevant difference was found between the automatic and manual measurement for almost all measures. For example, the categorical Peguero-Lo Presti voltage (manual: HR = 2.74 (1.22, 6.17), p = 0.015; automatic: HR = 2.53 (1.12, 5.70), p = 0.025) as well as the continuous values (manual: HR = 1.53 (1.16, 2.02), p = 0.003; automatic: HR = 1.39 (1.07, 1.80), p = 0.012) showed comparable HRs with similar significance levels. Only for the categorical Cornell voltage were differences visible, although with the same trend (manual: HR = 2.72 (1.02, 7.29), p = 0.046; automatic: HR = 2.10 (0.72, 6.15), p = 0.18). The log-likelihood values of the Cox models for cardiovascular mortality also indicate that there are no relevant differences between manual and automatic measurements in their ability to predict risk. In line with the HRs and the associated p-values for cardiovascular mortality, the log-likelihood values for the categorical Peguero-Lo Presti parameter were, for example, −129.12 for the manual and −129.51 for the automatic measurement. Likewise, the log-likelihood values for all-cause mortality in the Cox model for the categorical Peguero-Lo Presti voltage were very similar for manual (logL = −396.14) and automatic (logL = −396.13) measurements. The association with mortality was similar after additional adjustment for sex (data not shown).
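For orientation, the quantities reported above can be related to the standard textbook form of the Cox proportional hazards model. The notation below (covariate vector x_i, event indicator δ_i, risk set R(t_i), standard error SE) is introduced here for illustration; tie handling is omitted, and the tie method used in the study is not stated in the text.

```latex
% Cox proportional hazards model with baseline hazard h_0(t)
h(t \mid x) = h_0(t)\, \exp\!\left(\beta^{\top} x\right)

% Hazard ratio and approximate 95% confidence interval for covariate k
\mathrm{HR}_k = \exp(\hat{\beta}_k), \qquad
\mathrm{CI}_{95\%} = \exp\!\left(\hat{\beta}_k \pm 1.96\, \mathrm{SE}(\hat{\beta}_k)\right)

% Partial log-likelihood (no ties), summed over observed events
\log L(\beta) = \sum_{i:\, \delta_i = 1}
  \left[ \beta^{\top} x_i - \log \sum_{j \in R(t_i)} \exp\!\left(\beta^{\top} x_j\right) \right]
```

Because the manual and automatic models are fitted to the same patients and events with the same number of covariates, their logL values can be compared directly, with a value closer to zero indicating a better fit.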

Discussion
The primary objective of this study was to compare automatically and manually derived LVH parameters based on 12-lead ECG recordings. Furthermore, the predictive value of these parameters for all-cause and cardiovascular mortality was investigated and compared. The main finding of this study is that automatic determination of LVH parameters is a possible alternative to manual measurements and thus can facilitate risk prediction. To our knowledge, this is the largest study to demonstrate that automatically and manually derived ECG-based LVH parameters can be used interchangeably as risk predictors in stable European haemodialysis patients. A study by Izumi et al showed similar results when comparing manual and automatic measurements of the QT dispersion as a predictor of LVH in Japanese outpatients not on dialysis (Izumi et al 2011). The study by Izumi et al and our study show that the predictive value does not change depending on the workflow, i.e. manual or automatic evaluation. This finding is of special importance in haemodialysis patients, since the large volume changes and consequent variations in impedance affect ECG measurements and amplitudes (Braunisch et al 2022); an automatic evaluation thus allows a circadian assessment including dialysis and interdialytic intervals. The results underline the feasibility of automatic measurements also in haemodialysis patients, who generally represent a multimorbid patient cohort. Manual and automatic measurement of LVH parameters lead to comparable results, as can be seen in the Bland-Altman plots (figure 3) and the low, non-significant number of changes in LVH detection (figure 4). The hazard ratios for all-cause and cardiovascular mortality and their significance change only in categorical LVH detection for one LVH parameter (i.e. the Cornell voltage). They stay consistent for the remaining categorical values and for continuous values, regardless of whether ECG-based LVH parameters were measured manually or automatically (tables 2 and 3). All results were independent of sex.
Although the values were often similar, there are 40 outliers visible in the Bland-Altman plots, which were analysed in detail by examining every single case together with its manual and automatic assessment. Each outlier could be placed in one of three categories. The first category, which consisted of 23 measurements, concerned inaccuracies in the manual measurement, including wrong measurements caused by overlapping spikes of adjacent leads in the graphical representation. Overlapping ECG leads made it difficult to distinguish between the end of one lead and the beginning of the other (see example in figures 5(A) and (B)). In other cases, the wrong lead was determined to be S_D (i.e. the amplitude of the deepest S-wave among all 12 leads) for the calculation of the Peguero-Lo Presti voltage, since differences in amplitudes were not visually distinguishable, or there were mistakes in the transfer of the measured values to the combined data sheet including all manual measurements, i.e. transcription errors (n = 4). To improve manual measurements, the visual representation would need to be adapted automatically for each recording according to the amplitudes (e.g. via a recording-specific zoom factor). Transcription errors are human and can happen despite rigorous efforts and multiple checks. However, an automatic assessment and transfer of data can overcome these issues. The second category comprised mistakes in the automatic measurement, which was the case for 13 measurements. In these cases, the algorithm failed to detect the correct R or S peaks (see example in figures 5(C) and (D)), which led to wrong values. Often the R spikes were too small to be detected or there was too much noise on the ECG signal. A solution for these mistakes might be an adaptation of the pre-processing and filtering process to make the R peaks more pronounced in the signal. Furthermore, the introduction of a quality indicator for the automatic assessment based on the signal-to-noise ratio could be helpful. However, a change of the pre-processing steps would require a re-validation of the original algorithm. The last four outliers occurred because an adjacent heartbeat had to be used for the automatic measurement, as the algorithm could not detect the necessary amplitudes at the heartbeat used for the manual measurement. Outliers in this category might be a potential issue for the direct comparison in this study, but less so for the general application of the automatic assessment, since its major benefit is that one is not restricted to single heartbeats but can extend the analysis to multiple heartbeats over longer time periods and thus make the assessment more robust to outliers by averaging measures. Overall, the number of mistakes was smaller in the automatic measurements (N = 13) than in the manual procedure (N = 23). These numbers support the idea that an automatic assessment might be more reliable, thus reducing human errors and increasing clinical validity.
There were 13 cases where the categorical LVH detection changed between the manual and automatic measurements. Of these, 10 were outliers. For the three remaining cases, the values for the manual measurement were close to the threshold and the small difference between automatic and manual measurement led to a change in the categorical value. One must acknowledge that the sensitivity of ECG-based LVH parameters when compared to echocardiographic LVH is generally quite low, with a quite high specificity (Hancock et al 2009, Bacharova et al 2014), and the recently presented Peguero-Lo Presti voltage is supposed to increase sensitivity (Peguero et al 2017). New approaches based on ECG using deep learning methods might improve LVH detection (e.g. Khurshid et al 2021, Kokubo et al 2022), and future work could focus on explainability, i.e. on how an expert can re-trace, interpret, and explain how a certain result has been achieved, to enhance trust in deep learning methods (Holzinger 2021).
The similarities in the Cox models also indicate that the automatic algorithm can be used as an addition, if not an alternative, to manual measurement of the LVH parameters for LVH detection and LVH-based risk prediction. The automatic assessment is especially helpful for 24 h measurements, which are inconvenient to evaluate in full length, as doctors and other medical personnel often do not have the time to do it manually. The analysis of whole 24 h recordings could be used to analyse fluctuations of amplitudes, which could be especially present in haemodialysis patients, as observed before (Braunisch et al 2022). These changes are most likely caused by the dialysis treatment itself (e.g. fluid removal or temporal changes in electrolyte status), activities of the autonomic nervous system, or posture changes, and not by actual electrophysiological alterations (Drighil et al 2008, Poulikakos and Malik 2016). The importance of future analysis of 24 or even 48 h ECG recordings is supported by findings by Marcantoni et al highlighting the circadian modulation of ECG indices, which was interrupted by dialysis (Marcantoni et al 2022). Furthermore, an automatic evaluation can facilitate standardized (i.e. limiting inter- and intra-operator variability) longitudinal (i.e. repeated measurements over short to long time periods, from months to years) comparisons of LVH parameters. All this opens new possibilities for the continuous monitoring of dialysis patients, the early detection of upcoming target organ damage, and therapy support and control. Importantly, with regard to widespread clinical application, a limitation of the algorithm is the correct measurement of low amplitudes and peaks (e.g. small R-peaks) and of ECG signals with an insufficient signal-to-noise ratio; here, a quality indicator could point to measurements for which a manual review is recommended. Nevertheless, this study showed that an automatic algorithm can obtain similar values for the LVH parameters Peguero-Lo Presti voltage, Cornell voltage, and Sokolow-Lyon voltage as a medical professional measuring manually.
The strengths of the current study are the large sample size with available manual annotations of LVH parameters, thus allowing a direct comparison between manual and automatic values, and a long follow-up time including numerous events. Furthermore, the sample represents a well-described cohort of stable European haemodialysis patients with no further preselection besides the mentioned exclusion criteria. With regard to limitations, the cohort consisted mainly of Caucasian patients from Munich and its suburban area, so it was not possible to assess the effects of ethnicity. Therefore, it is unknown whether the study results can be generalized to other ethnic groups. However, considering the results of Izumi et al, it can be assumed that other ethnic groups could have similar results, although different cut-off points might be necessary (Izumi et al 2011). Secondly, analyses were limited to the day of dialysis, as 24 h ECG recordings in the ISAR study were started just before the midweek dialysis session. Furthermore, the current comparison is based on one single heartbeat in the whole 24 h recording, and thus influences of haemodialysis could not be considered. An additional uncontrolled factor, but probably an advantage as well, is that no further data cleaning was performed (e.g. correcting for manual transcription errors or automatic misdetection of heartbeats), so that the comparison represents a realistic scenario.
In conclusion, the results of this study clearly indicate that automatic measurement of LVH parameters is a possible alternative to manual measurements and that the risk prediction is similar.

Figure 2. Example of amplitude calculation for AIT ECGSolver algorithm.

Figure 3. Bland-Altman plots and scatter plots to depict differences between the manually and automatically measured ECG LVH parameters. A-C The graphs on the left side show the scatter plot of manual (x-axis) versus automatic (y-axis) measurements and the line of identity; graphs on the right are the corresponding Bland-Altman plots.

Figure 4. Changes of LVH detection based on categorical values between manual and automatic measurements. Measurements were compared using the McNemar test.

Figure 5. Example of outliers in selected ECG leads for two patients. In patient ID011 (A and B), manual assessment was wrong (A) and automatic measures correct (B); for patient ID186 (C and D), manual assessment was correct (C) and automatic annotation wrong (D).

Table 1. Baseline characteristics. Countable data are expressed in absolute numbers and percent, data following a normal distribution are reported as mean (standard deviation), and otherwise as median [1st quartile, 3rd quartile].

Table 2. Univariate hazard ratios (HR) for all-cause and cardiovascular mortality including 95% confidence interval and log-likelihood (logL) for LVH parameters for the manual and automatic measurement.

Table 3. Adjusted hazard ratios (HR) for all-cause and cardiovascular mortality including 95% confidence interval and log-likelihood (logL) for LVH parameters for the manual and automatic measurement. Adjusted for Charlson comorbidity index and the cardiovascular mortality risk score.