Spectral analysis of heart sounds associated with coronary artery disease

Objective. The aim of this study was to find spectral differences of diagnostic interest in heart sound recordings of patients with coronary artery disease (CAD) and healthy subjects. Approach. Heart sound recordings from three studies were pooled, and patients with clear diagnostic outcomes (positive: CAD and negative: Non-CAD) were selected for further analysis. Recordings from 1146 patients (191 CAD and 955 Non-CAD) were analyzed for spectral differences between the two groups using Welch’s spectral density estimate. Frequency spectra were estimated for systole and diastole segments, and time-frequency spectra were estimated for first (S1) and second (S2) heart sound segments. An ANCOVA model with terms for diagnosis, age, gender, and body mass index was used to evaluate statistical significance of the diagnosis term for each time-frequency component. Main results. Diastole and systole segments of CAD patients showed increased energy at frequencies 20–120 Hz; furthermore, this difference was statistically significant for the diastole. CAD patients showed decreased energy for the mid-S1 and mid-S2 segments and conversely increased energy before and after the valve sounds. Both S1 and S2 segments showed regions of statistically significant difference in the time-frequency spectra. Significance. Results from analysis of the diastole support findings of increased low-frequency energy from previous studies. Time-frequency components of S1 and S2 sounds showed that these two segments likely contain heretofore untapped information for risk assessment of CAD using phonocardiography; this should be considered in future works. Further development of features that build on these findings could lead to improved acoustic detection of CAD.


Introduction
Coronary artery disease (CAD) has been declining as a percentagewise cause of death in developed regions such as EU and USA. Nonetheless, it has been a growing cause of death globally and remains the primary cause of death worldwide. Approximately 9.4 million people died from CAD in 2016, accounting for 16.8% of all deaths (World Health Organization 2018).
CAD has been decreasing as a cause of death in developed countries, but the number of patients referred for further investigations remain high. Moreover, the prevalence of CAD among these referred patients is only at 6%-12% (Douglas et al 2015, Therming et al 2018, Winther et al 2018. This means that many Non-CAD patients are exposed to risks associated with unnecessary and sometimes invasive testing that burdens the health care system with additional costs. Through analysis of heart sounds, phonocardiography (PCG) provides the possibility to do improved pretest likelihood estimation of CAD (a concept proposed by the ESC guidelines) using a fast, cheap, and noninvasive method. Coupled with clinical parameters such as age, gender, and blood pressure, this method has Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
shown to be effective for preliminary investigation of patients suspected of having CAD through ruling out Non-CAD patients from further testing (Makaryus et al 2013, Azimpour et al 2016, Winther et al 2018, Schmidt et al 2019, Winther et al 2021. Heart sounds have previously been investigated for spectral differences in attempts to detect CAD using phonocardiography (Semmlow et al 1983, 1993, Gauthier et al 2007, Schmidt et al 2007, 2011, 2015, Semmlow and Rahalkar 2007, Dragomir et al 2016, Winther et al 2016. Most studies have focused on analyzing the diastole and general findings have been that CAD causes an increase in energy of some frequencies. However, there are varying reports regarding at which frequencies this increased energy occurs. Semmlow et al reported in Semmlow et al (1983) an increase at 180-300 Hz and later in Semmlow et al (1990) for frequencies above 300 Hz. This is in line with findings from Akay et al in (1990Akay et al in ( ), (1993 of increased energy at frequencies of 300-800 Hz. Gauthier et al reported increased energy for frequencies above 130 Hz in Gauthier et al (2007), which is more in line with the initial findings from in Semmlow et al (1983). Successfully developed features based on frequencies above 240 Hz in Schmidt et al (2007). Later in Schmidt et al ( ), (2015, they found that features extracted from the lower frequency band (25-250 Hz) performed better in noisy conditions. More recently, in Winther et al (2016) and in Schmidt et al (2011) found the primary increase of energy occurring in the frequency range below 200 Hz, whereas reported this increase in energy for frequencies above 150 Hz (Dragomir et al 2016).
Whatever the reported frequencies, this increased energy of the diastole heart sound is typically explained as weak murmurs resulting from turbulent blood flow following the occlusion of coronary arteries, and simulated turbulence models (Jin-Zhao et al 1990) showing similar effects support this explanation.
Whereas previous studies show that some level of success for diagnosing CAD can be achieved through analysis of diastole heart sounds, other parts of the heart sound have not seen the same level of investigation. The focus on the diastole has existed, as Semmlow and Rahalkar puts it in Semmlow and Rahalkar (2007), because 'it is during this quiet period, that coronary blood flow is maximal, so that acoustic signature associated with turbulent blood flow is likely to be loudest'.
Left ventricular coronary blood flow is phasic, with high blood flow during diastole, and low (or even reverse) blood flow during systole. However, evidence suggests that this may not be the case for the right ventricular coronary blood flow, which could increase during systole rather than diminish (Goodwill et al 2017). Furthermore, showed in de Waard et al (2019) how stenosis of the LAD can affect the diastole and systole blood flow velocities. This suggests that there might be valuable information gained in analysis of heart sounds from the systole segment of the heartbeat in addition to the diastole for detection of stenotic murmurs. M Mansour et al showed in Mansour et al (2020) correlation between coronary artery calcium (CAC) score and diastole dysfunction (DD). This relationship may indicate that CAD patients have different relaxation patterns and valve-sounds. If this is the case, there might be differences in the first (S1) and second (S2) heart sounds that could be used for detection of CAD with S1 resulting from the closing of the mitral and tricuspid valves, and S2 resulting from the closing of the aortic and pulmonic valves. Other studies (Pathak et al 2020, Liu et al 2021 showed an increased performance by including the entire PCG recording instead of only the diastole segment, or by using simultaneous recordings at multiple sites on the chest. This suggests that there might be spectral differences in the PCG of CAD patients other than what can be observed in the diastole.
We hypothesize that there are spectral differences in heart sound recordings from patients with CAD and patients without CAD, and we will investigate this hypothesis for the four previously mentioned segments of the heart sound: S1, S2, systole and diastole. With a large database available, it is possible to investigate absolute spectral differences in heart sounds between CAD and Non-CAD patients. A typical approach for spectral analysis of heart sounds is to normalize the spectra to either a sub-frequency band of the spectrum or the total power in order to lower the influence of large inter-subject variation. However, this normalization can make it more difficult to determine exactly which frequencies are affected.
The investigation might reveal further unexploited components of the heart sound that can be used for improvement of algorithms for risk stratification of CAD. AdoptCAD and Dan-NICAD were approved by the Regional Committees on Health Research Ethics for Central Denmark, and BIO-CAC was approved by the Regional Committees on Health Research Ethics for Southern Denmark. All studies were conducted according to the Helsinki Declaration, and written informed consent was obtained from all patients.
The procedure for recording heart sounds was the same for all studies. First, subjects rested in supine position for 5 min prior to recording. Following this resting period, subjects continued to be in supine position while heart sounds were recorded using an Acarix CADScor device at the fourth intercostal space during 4 breath-hold periods of 8 seconds each.
Only subjects where at least one heart sound recording was available were included in the dataset. If multiple heart sound recordings were available, the first was selected. Subjects were excluded if their heart sound recordings failed pre-analysis qualification of the Acarix CADScor heart sound processing framework. In total, 453 subjects were excluded because either no recording was available or because the recording did not pass preanalysis qualification.
Patients were classified into one of three diagnostic categories: CAD, Non-CAD, and Unclassified. Patients with a coronary angiographic identified stenosis with at least 50% diameter reduction were classified as CAD. Patients with a CAC score of 0 and no coronary artery stenosis discovered through coronary computed tomography angiography were classified as Non-CAD. The remaining patients, who could not be classified into either of the two categories, were classified as Unclassified. Only patients with a clear diagnostic outcome of either CAD or Non-CAD were included for further analysis, which meant the exclusion of an additional 938 patients who were Unclassified.
Finally, heartbeats where annotation of S1 failed were discarded, and subjects with fewer than 5 remaining annotated heartbeats were excluded from further analysis, resulting in the exclusion of 62 subjects (19 CAD and 43 Non-CAD).

Preprocessing
An adaptive filter was used to reduce background noise using a simultaneous recording of room noise. Breathhold periods of phonocardiograms were segmented into single heartbeats using a duration dependent hidden Markov model (HMM) developed by in , and heart sounds (S1 (onset and offset), S2 (onset and offset), S3, and S4) were annotated automatically by the HMM.
A fourth order Butterworth band-pass filter with cutoff frequencies of 5 and 1000 Hz was applied to heartbeat signals. This was followed by a whitening filter designed to reduce the influence of spectral leakage and to emphasize the diastole differences between CAD and Non-CAD heart sounds as described in Larsen et al (2019). Finally, a second order Butterworth high-pass filter with a cutoff frequency of 20 Hz was applied.

Heartbeat alignment and heart sound segments
Before spectral analysis, heartbeats for the same subject were aligned to the onset of S1 and S2 respectively as shown in figure 1. Analyses of S1 and systole segments were performed with heartbeats aligned to onset of S1, and analyses of S2 and diastole segments were performed with heartbeats aligned to onset of S2.
The four heart sound segments analyzed in this article and shown in figure 1 were defined as described in table 2.

Frequency spectrum analysis
Four segments were investigated for spectral differences related to CAD: S1, S2, systole, and diastole segments as shown in figure 1. For all analyses, the frequency spectra were estimated using Welch's power spectral density (PSD) estimate with the function pwelch in MATLAB's Signal Processing Toolbox. Hamming windows were used in conjunction with pwelch with lengths as described in table 2. Table 1. Composition of pooled dataset. The dataset consists of pooled data from three studies: AdoptCAD, Dan-NICAD, and BIO-CAC. The composition of gender is fairly even, but the number of CAD subjects is low relative to the number of Non-CAD subjects. This is due to the low prevalence of CAD in the patient group. The number for original subjects is the number that were included in the source dataset before any exclusions. Statistics for number of CAD, Non-CAD, Gender, and Age are for patients included in the analysis. The category unclassified was not included in further analysis but is included in the A single average frequency spectrum was estimated for systole segments, whereas S1 and S2 segments were investigated using time-frequency analysis with several smaller subsegments for each segment as noted in table 2. The reason for this is that the S1 and S2 segments are highly dynamic.
The logarithm of the PSD was used for comparison of frequency spectra of CAD and Non-CAD subjects, and the spectrum difference between CAD and Non-CAD patients was calculated by subtracting the average spectrum of Non-CAD patients from the average spectrum of CAD patients.

Statistical analysis
A four-way ANCOVA was used to investigate statistical significance of observed average differences in the frequency and time-frequency spectra. To minimize the influence of demographic differences between the two groups (CAD and Non-CAD), independent variables for gender, age, and body mass index (BMI) were included in the model. Previous work (Larsen et al 2017) have shown some correlation between these parameters (in Figure 1. Multiple heartbeats from a heart sound recording of the same subjects aligned to the onset of (a) S1 and (b) S2. Analyses of S1 and systole segments were performed with alignment to S1 (a), whereas analyses of S2 and diastole segments were performed with alignment to S2 (b). The motivation for two different alignments can be seen in the difference of the S1 and S2 segments for the two alignments (a) and (b). (Recording: AC003-3005_001). Table 2. Overview of the settings for the spectral analyses. Heartbeats for the same subject were segmented and aligned to the onset of either S1 or S2. For systole and diastole segments, a single PSD was calculated for each of the two segments. For S1 and S2 segments, several subsegments were defined, and PSDs were calculated for each subsegment. A Hamming window was applied when estimating the PSDs with sizes as listed in the table. The square root of the PSD was used for evaluating the statistical significance of the frequency and time-frequency components.

Segment
Alignment Window Analysis and segment definition Systole S1 onset 64 ms Single segment from 16 ms after the offset of S1 to 16 ms before the onset of S2. Diastole S2 onset 128 ms Single segment from 175 ms to 450 ms after the onset of S2. If S3 or S4 sounds were annotated within this period, the segment was reduced to avoid the influence of S3 and S4. S1 S1 onset 64 ms The S1 segment was divided into 13 sub-segments, each of 64 ms length, with centers at [−64, −48, −32, −16, 0, 16, 32, 48, 64, 80, 96, 112, 128] ms in relation to the onset of S1. The frequency spectrum was calculated for each of these segments. S2 S2 onset 64 ms The S2 segment was divided into 9 sub-segments, each of 64 ms length, with centers at [−48, −32, −16, 0, 16, 32, 48, 64, 80] ms in relation to the onset of S2. The frequency spectrum was calculated for each of these segments. particular BMI) and the amplitude of S1 and S2. The model included the following independent variables: CAD Diagnosis (binary), Male Gender (binary), Age (continuous), and BMI (continuous). The linear regression model is written as follows: To investigate the statistical significance of frequency spectrum components in distinguishing between CAD and Non-CAD patients, the p-value of the Diagnosis-part of the model was evaluated. Each component of the frequency and time-frequency domains were evaluated separately using MATLAB's anovan function. Components with p-values for β 1 (diagnostic category) below 0.05 were considered statistically significant.
Lastly, the single-feature performance of each frequency and time-frequency component for classification of CAD and Non-CAD patients was evaluated for each of the four segments. For quantifying classification potential, the area under the curve (AUC) of a receiver operating characteristics curve was evaluated for each frequency and time-frequency component for each segment using MATLAB's perfcurve function. The results of these analyses were then collected in a plot for each segment showing the performance for each component as it related to that segment. A two-sided 95% confidence interval is included in the plots and were calculated using the method described by Hanley and McNeil in JA and BJ (1982).

Results
Heart sound recordings from a total of 1146 subjects were successfully analyzed (191 CAD and 955 Non-CAD). Frequency plots were obtained for systole and diastole segments, and time-frequency plots were obtained for S1 and S2 segments. Frequency plots for systole and diastole included reporting of spectra with modeled statistical significance, whereas separate plots for modeled statistical significance were made for S1 and S2. Results for each segment also include a plot of the mean power difference between CAD and Non-CAD patients, as well as a plot showing the classification AUC performance of each frequency and time-frequency component.
This section contains the results from the four segments analyzed in this paper: systole, diastole, S1, and S2.

Analysis of systole
Analysis of systole segments of the heart sounds produced the three plots figures 2(a)-(c).
There are significant differences of the average energies between CAD and Non-CAD patients at frequencies 15-200 Hz with a 95% confidence interval and peak difference at frequencies 30-125 Hz of roughly 2 dB. However, when accounting for age, gender, and BMI, the model only shows one of the tested frequencies (93.75 Hz) to be of statistical significance (p < 0.05).
Evaluation of the classification performance of the frequency components resulted in a plot which looks similar to the mean power difference. The performance was significantly different from random (50% AUC) performance between 15 and 200 Hz with a peak performance of 62.5% AUC for the frequency component 47 Hz.

Analysis of diastole
Analysis of diastole segments of the heart sounds produced the two plots figures 2(d)-(f).
There are significant differences of the average energies between CAD and Non-CAD patients at frequencies 0-500 Hz with a 95% confidence interval and a peak difference at frequencies 45-120 Hz of roughly 3 dB. When accounting for age, gender, and BMI, the model shows statistical significance (p < 0.05) for frequencies 40-95 Hz.
As was the case for the systole, evaluation of the classification performance of the frequency components for the diastole resulted in a plot which looks similar to the mean power difference. The performance was significantly different from random (50% AUC) performance for frequencies 0-500 Hz with a peak performance of 65% AUC at frequencies 47-63 Hz.

Analysis of S1
Analysis of S1 heart sounds produced the four plots shown in figure 3.
In figure 3(a), CAD subjects show on average higher energy of all frequencies before and after S1 and especially at lower frequencies immediately before the S1 sound with a positive difference at frequencies 20-160 Hz of 2-3 dB (CAD subjects have higher energy). The same frequency range can be seen after the S1 sound with a positive difference of 1-2 dB. The positive difference before S1 seems to be skewed towards higher frequencies when approaching the S1 sound. Whereas there is a positive difference before and after the S1 sound, there is a negative difference mid-S1 and in particular at frequencies 400-700 Hz where there is a negative difference of 1-2 dB.
As seen in figure 3(c), mainly frequencies above 400 Hz immediately before S1 show statistical significance. There seems to be a gradient where lower frequencies (near 400 Hz) have significance immediately before the S1 heart sound. As the S1 sound progresses, increasingly higher frequencies have statistical significance. Mid-S1 and post-S1 areas did not show any components with modeled statistical significance.
The AUC performance plot (figure 3(d)) looks similar to the mean power difference plot. The best performing areas do not overlap with areas that are of modeled statistical significance (figure 3(c)). Whereas the peak AUC are at around 63% AUC at low frequencies prior to the onset of S2 and 42% (inverse 58%) AUC at mid frequencies mid-S2, the performance of the areas showing statistical significance when accounting for age, sex, and symptoms, is only around 55% AUC.

Analysis of S2
Analysis of S2 heart sounds produced the four plots shown in figure 4.
In figure 4(a), CAD subjects show on average higher energy of all frequencies before and after S2, as well as frequencies around 30 Hz during S2. Frequencies 20-200 Hz before S2 have a positive difference of 2-3 dB, and frequencies 40-400 Hz after S2 have a positive difference of 1-2 dB. Mid-S2 has a negative difference, which is most pronounced for frequencies 300-600 Hz (2-3.5 dB).
As seen in figure 3(c), there are two areas of note when evaluating the statistical significance of S2 timefrequency components: the positive difference area prior to S2 at 50-90 Hz and the negative difference area early-S2 at 300-500 Hz. The smaller positive difference area mid-S2 at around 30 Hz also shows statistical significance, however with less significant p-values.
As was the case for S1, the AUC performance plot (figure 4(d)) for S2 looks similar to the mean power difference plot. However, in the case of S2 there is an overlap of areas of peak performance (figure 4(d)) and modeled statistically significance ( figure 4(c)). The low frequency pre-S2 peak performance is at around 65% AUC, and the early-S2 mid-frequency peak performance is at around 37% (inverse 63%) AUC. The lowfrequency area mid-S2 which also showed modeled statistical significance (although at a higher p-value) had an AUC performance of around 57%.

Discussion
Results from analysis of the diastolic segment associated an increase of absolute energy for low-frequency components (<200 Hz) with the presence of CAD, supporting similar findings in previous articles , 2011, 2015, Winther et al 2016. However, this study did not find any significant difference in energy for higher frequencies, which contrasts the findings of other studies , 1993, Schmidt et al 2007. One explanation could be that previous studies-for the most part-have examined relative energy, whereas the current study examined absolute energy.
Analysis of the systole yielded similar average differences of the frequency spectra as the diastole, however with roughly 1 dB lower difference of low-frequency components. Furthermore, only a single tested frequency showed modeled statistical significance in the same range as the diastole. Likewise, the single-component performance for the systole showed the same trend as for the diastole but with the diastole having superior performance at all frequencies. This indicates that the systole does not provide additional information to diagnosing CAD beyond what is contained in the diastole segment. This finding supports the argument stated among others by in Semmlow et al (1983), that coronary sounds are likely be loudest during diastole, as the contraction of the myocardium exerts pressure on the coronary arteries and attenuates any sounds associated with coronary narrowing. The confidence interval for the average energy was larger than that of the diastole, and this uncertainty may have been caused by the influence of S1 or S2 sounds in some heart beats. If this is the case, reducing systole segment length by increasing the buffer after S1 and before S2 could help alleviate this problem. However, this could mean that several subjects would not be included in this analysis due to the systole segment becoming too short. Furthermore, efforts to improving the annotation of the end of S1 could enhance the results of the systole frequency spectrum and reduce the confidence interval. Finally, it is possible to include the systole segment in the time-frequency analyses of S1 and S2 by expanding the segments after S1 and before S2.
Analyses of the S1 and S2 segments provided a novel comparison of spectral differences in these heart sounds of CAD and Non-CAD patients and yielded several new findings. S1 and S2 difference plots of the timefrequency spectra have similarities: both have positive differences before and after the valve sounds, which are pronounced for lower frequencies and negative differences during the valve sounds. The progression of the valve Figure 3. Results from the S1 analysis. (a) Progression of the average frequency spectrum for the S1 heart sound of CAD and Non-CAD subjects. Average energies for CAD and Non-CAD subjects have different progressions during the S1 heart sound. Whereas CAD subjects generally have greater energy before and after the S1 sound, they have a lower average energy during the S1 sound. (b) Difference of the average S1 time-frequency spectra between CAD and Non-CAD subjects. Positive differences mean that CAD subjects have higher energy on average and negative differences mean that Non-CAD subjects have higher energy on average. (c) Modeled statistical significance (p-value) of S1 time-frequency components. (d) AUC performance of S1 time-frequency components in classifying CAD and Non-CAD patients.
sounds for CAD subjects are on average 'more flat', meaning they progress slower and are of lower amplitude for CAD than Non-CAD subjects. However, whereas the negative difference mid-valve sound showed statistical significance for S2, this was not the case for S1. Likewise, the single component AUC performance of the midvalve sound was better for S2 than for S1. It is possible that the alignment of the S1 sounds is not as precise as the alignment of S2, and that improved annotation of S1 will change the findings for the S1 segment. The physiological explanation for the significance of the S2 sound could be that the relaxation pattern for CAD patients is different than for Non-CAD patients.
Similar to the findings of the systole and diastole segments, the S1 and S2 segments showed positive differences for frequencies below 200 Hz before and after the valve sounds. However, only a small area before S2 showed modeled statistical significance.
Results from analysis of S1 showed a large positive difference area before S1 for frequencies above 400 Hz. Although this difference is significant and could possibly contribute to new features for diagnosing CAD, the cause of this difference remains unexplained.

Conclusion and future work
Findings in this study show that additional information for improved pre-test likelihood estimation of CAD can likely be gained through analysis the S1 and S2 heart sounds. Future work will involve development of features for classification of CAD based on results presented in this article.