Paper The following article is Open access

Machine learning enabled detection of COVID-19 pneumonia using exhaled breath analysis: a proof-of-concept study


Published 13 March 2024 © 2024 The Author(s). Published by IOP Publishing Ltd
Citation Ruth P Cusack et al 2024 J. Breath Res. 18 026009 DOI 10.1088/1752-7163/ad2b6e

1752-7163/18/2/026009

Abstract

Detection of the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) relies on real-time-reverse-transcriptase polymerase chain reaction (RT-PCR) on nasopharyngeal swabs. The false-negative rate of RT-PCR can be high when viral burden and infection are localized distally in the lower airways and lung parenchyma. An alternative safe, simple and accessible method for sampling the lower airways is needed to aid in the early and rapid diagnosis of COVID-19 pneumonia. In a prospective unblinded observational study, patients admitted with a positive RT-PCR and symptoms of SARS-CoV-2 infection were enrolled from three hospitals in Ontario, Canada. Healthy individuals or hospitalized patients with negative RT-PCR and without respiratory symptoms were enrolled into the control group. Breath samples were collected and analyzed by laser absorption spectroscopy (LAS) for volatile organic compounds (VOCs) and classified by machine learning (ML) approaches to identify unique LAS-spectra patterns (breathprints) for SARS-CoV-2. Of the 135 patients enrolled, 115 patients provided analyzable breath samples. Using LAS-breathprints to train ML classifier models resulted in an accuracy of 72.2%–81.7% in differentiating between SARS-CoV-2 positive and negative groups. The performance was consistent across subgroups of different age, sex, body mass index, SARS-CoV-2 variants, time of disease onset and oxygen requirement. The overall performance was higher than that of the VOC-trained classifier model, which had an accuracy of 63%–74.7%. This study demonstrates that an ML-based breathprint model using LAS analysis of exhaled breath may be a valuable non-invasive method for studying the lower airways and detecting SARS-CoV-2 and other respiratory pathogens. The technology and the ML approach can be easily deployed in any setting with minimal training. This will greatly improve access and scalability to meet surge capacity; allow early and rapid detection to inform therapy; and offer great versatility in developing new classifier models quickly for future outbreaks.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The emergence of the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) in 2019 has resulted in over 6.7 million deaths worldwide, mostly due to COVID-19 pneumonia and respiratory failure [1]. SARS-CoV-2 is primarily transmitted via the respiratory route [2] and its detection relies on real-time-reverse-transcriptase polymerase chain reaction (RT-PCR) on nasopharyngeal swabs (NPS) [3, 4]. The false-negative rate of this approach may be as high as 33%, particularly late in the course of the disease when viral burden and infection are mainly in the lower airways. In this situation, lower airway sampling via bronchoscopy is required, which poses a significant risk of viral transmission to those performing the procedure [3, 5, 6]. Furthermore, access to bronchoscopy is not equitable, and the procedure may be contraindicated in sick patients. Therefore, an alternative method for lower airway sampling that is safe and easily accessible could greatly facilitate the rapid and early diagnosis of COVID-19 pneumonia and allow earlier implementation of appropriate therapy.

Exhaled breath may be an important source of lower airway sampling, as it contains several hundred metabolites from the distal airways and the lung parenchyma. Indeed, earlier breath analysis studies utilized gas chromatography-mass spectrometry (GCMS) to identify volatile organic compounds (VOCs) as biomarkers for infectious and neoplastic processes in the lung [7–9]. However, GCMS requires expensive equipment, trained personnel and precise calibration. Alternatively, laser absorption spectroscopy (LAS) offers a different approach to disease detection by utilizing a myriad of metabolites from a single exhaled breath sample [10]. Together with data-driven machine learning (ML) classification models, a vast number of identified metabolites can be processed to provide unique patterns of VOCs as 'breathprints' for the detection of COVID-19 pneumonia. In the present study, an ultra-sensitive LAS-based technique, using cavity ring-down spectroscopy (CRDS), was used to discriminate between breath samples from SARS-CoV-2 positive and negative individuals. The primary objectives were to: (1) explore the LAS-spectral breathprints identified by ML to discriminate between SARS-CoV-2 positive and negative individuals and (2) assess the discriminatory performance of this approach. The secondary objective was to identify unique VOCs that can provide insights into the pathobiology of COVID-19 pneumonia.

2. Methods

2.1. Study design

This was a prospective, unblinded, observational proof-of-concept study. Patients admitted between 11 February and 13 August 2021 with positive SARS-CoV-2 by RT-PCR on NPS and symptoms of SARS-CoV-2 infection were recruited from 3 tertiary centers in Ontario, Canada. The study was approved by the research ethics committee of each participating site and registered on ClinicalTrials.gov (NCT04867213). All participants provided informed written consent. Participants were eligible if they were 18 years or older and provided sufficient exhaled breath for analysis. A similar group of healthy individuals or hospitalized patients without respiratory symptoms and with a negative SARS-CoV-2 RT-PCR were enrolled as controls. Participants suspected of COVID-19 despite a negative RT-PCR test, those with chronic lung disease, and those who had smoked tobacco, cannabis or electronic cigarettes within 4 h and/or consumed alcohol within 8 h of exhaled breath collection were excluded. Demographic and clinical data were collected at the time of exhaled breath collection, either by questionnaire or from medical health records.

2.2. Exhaled breath collection

Exhaled breath was collected using a proprietary exhaled breath sampler (Breathe BioMedical, Moncton, New Brunswick, Canada) which tracks CO2 levels to collect alveolar breath into Tenax TA sorbent tubes. Participants exhaled into the breath sampler while sitting or standing without the use of a nose clip. They were instructed to breathe deeply and exhale through a single-use bacterial/viral filter (SunMed FH603003) on the sampler's mouthpiece, and the procedure was repeated until 5-litre (l) samples were amassed (see figure 1(A)). CO2 was recorded via a real-time sensor to confirm that the alveolar portion of the breath sample was collected. Participants' breathing patterns were monitored throughout breath collection. Duplicate sampling of subjects was prohibited. Further information is detailed within the supplemental materials.


Figure 1. Schematic representation of the principle of the breath collection device. (A) An exhaled breath sample was collected into Tenax TA desorption tubes through a single use filter on the sampler's mouthpiece. (B) Breath sample with identifying tag is shipped to central lab for analysis. (C) Cavity ring-down spectroscopy is a highly sensitive laser spectroscopy that measures the absorption of light in a closed optical cavity using highly reflective mirrors placed at each end of the cavity. By recording the decay times of light, the absorption of light is measured to indicate the concentration of compounds. (D) Example infrared spectrum of a sample. (E) Measured spectra were used to develop a supervised machine learning classification model to discriminate SARS-CoV-2 positive from negative samples.


Breath samples were collected into Tenax TA desorption tubes and stored at −20 degrees Celsius until analysis with the exception of transport to and from the analysis site. Tenax tubes were chosen rather than sampling bags because of their strong hydrophobic nature and consistent retention of exhaled breath VOCs [11]. Breath samples were analysed within two weeks of sampling. Prior to shipment to collection sites, all sorbent tubes were conditioned and batch tested to ensure low background levels. All sorbent tubes were used within two weeks of conditioning.

2.3. Cavity ring-down spectroscopy (CRDS) measurements and machine learning techniques

Mid-infrared profiles were measured from the 5 l breath samples (desorbed at 300 degrees Celsius) using CRDS, a highly sensitive laser spectroscopy technique that detects trace chemicals by measuring the absorption of light in a closed optical cavity. The CRDS instrument was designed and built in-house. Each full measurement took approximately 60 min. Three tunable CW lasers, 12C(16O)2, 13C(16O)2 and 12C(18O)2, provided 204 wavelengths between 9.0–11.25 µm (1100–890 cm−1). This wavelength range is in the heart of the fingerprint region, a part of the electromagnetic spectrum where organic compounds show signature infrared absorption. Specifically, isoprene, methanol and ammonia are among the common breath VOCs with distinct signatures in this region. The ringdown times at each wavelength were measured with pure nitrogen (τ0) and with sample (τ).

The path length of light interacting with the breath sample can be increased many-fold using highly reflective mirrors placed at each end of the cavity. By recording the decay times of light in the empty versus the breath-filled cavity, the absorption of light by the constituent chemicals within human breath is measured. The absorption spectrum can indicate the concentration of a compound with sensitivities in the parts-per-billion range [12].
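The relation between the two ring-down times and the absorption at one wavelength can be sketched as follows. This is a minimal illustration that assumes the standard textbook CRDS relation for a cavity completely filled with sample, α = (1/c)(1/τ − 1/τ0); the conversion actually used by the in-house instrument is not described here, and the function name and example decay times are hypothetical.

```python
# Speed of light in cm/s, so the absorption coefficient comes out in cm^-1.
C_CM_PER_S = 2.99792458e10


def absorption_coefficient(tau, tau0):
    """Absorption coefficient (cm^-1) from ring-down times in seconds.

    tau  : decay time with the breath sample in the cavity
    tau0 : decay time with pure nitrogen in the cavity
    Assumes the textbook relation alpha = (1/c) * (1/tau - 1/tau0).
    """
    if tau <= 0 or tau0 <= 0:
        raise ValueError("ring-down times must be positive")
    return (1.0 / tau - 1.0 / tau0) / C_CM_PER_S


# Hypothetical example: the sample shortens the decay from 1.0 us to 0.9 us.
alpha_example = absorption_coefficient(tau=0.9e-6, tau0=1.0e-6)
```

An absorbing sample shortens the decay time (τ < τ0), so α is positive; repeating the calculation at each of the 204 wavelengths would give one absorption spectrum per breath sample.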

2.3.1. Machine learning classification model

The measured spectra were used to develop a supervised ML classification model that discriminates SARS-CoV-2 positive from negative samples. First, any missing absorption coefficients in each spectrum were replaced using linear interpolation, and each spectrum was then rescaled using vector normalization, following a previously validated approach [13–15]. Next, first-order spectral derivative sequences, each comprising 191 values, were extracted from the normalized breathprints and used as features for classification. The features that provided the most useful information were identified using a variant of the minimum redundancy maximum relevance algorithm [16], which ranks features based on their correlation to the class labels (SARS-CoV-2 positive or SARS-CoV-2 negative) and prioritizes features that provide unique information. Following this step, the number of features retained for classification was optimized using classification performance. The maximum number of allowed features was fixed at 20 to limit overfitting and model complexity. A linear support vector machine (SVM) learning approach was used for classification. This uses features from a set of training samples to construct an algorithm which then acts as a decision boundary for classifying future samples [13–15].
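The preprocessing steps described above (gap filling, vector normalization, first-order derivative) can be sketched with NumPy. This is an illustrative sketch only: the function name is hypothetical, missing values are assumed to be encoded as NaN, and a simple first difference is assumed for the derivative. A plain first difference of 204 points gives 203 values rather than the 191 reported, so the study's derivative scheme evidently differs in detail.

```python
import numpy as np


def preprocess_spectrum(alpha):
    """Fill gaps, vector-normalize, and differentiate one spectrum.

    alpha : 1-D sequence of absorption coefficients, NaN where missing
            (NaN encoding is an assumption of this sketch).
    """
    alpha = np.asarray(alpha, dtype=float)
    idx = np.arange(alpha.size)
    missing = np.isnan(alpha)

    # Linear interpolation across missing absorption coefficients.
    filled = alpha.copy()
    filled[missing] = np.interp(idx[missing], idx[~missing], alpha[~missing])

    # Vector normalization: scale the spectrum to unit Euclidean length.
    norm = np.linalg.norm(filled)
    normalized = filled / norm if norm > 0 else filled

    # First-order spectral derivative, approximated by a first difference.
    return np.diff(normalized)


# Hypothetical four-point spectrum with one missing value.
features = preprocess_spectrum([1.0, np.nan, 3.0, 4.0])
```

Normalizing before differentiating removes overall intensity differences between samples, so the derivative features reflect spectral shape rather than total absorption.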

Two validation approaches were used to assess the ML-classifier's performance: non-nested and nested leave-one-out cross-validation (LOOCV). The non-nested approach utilized all samples for preprocessing and feature selection, while the nested approach utilized only the training set for preprocessing and feature selection, to avoid leaking information from the test set. The standard non-nested LOOCV framework provides a single optimal feature set that is fixed during model training and testing, while the nested approach results in multiple optimal feature sets (one for each training set created during the cross-validation procedure). Both approaches were used, since the non-nested method tends to yield more optimistic estimates and the nested method tends to yield more pessimistic estimates. The true performance of the ML-classifier is therefore expected to lie between the estimates from the two approaches [13, 17].
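The difference between the two validation schemes can be illustrated with scikit-learn. This is a sketch under stated assumptions, not the study's implementation: `SelectKBest` with an F-test stands in for the minimum redundancy maximum relevance variant, the number of features `k` is fixed rather than optimized, and the data are synthetic noise with hypothetical dimensions.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def non_nested_loocv_accuracy(X, y, k=7):
    # Feature selection sees ALL samples, including each held-out sample,
    # so one fixed feature set is used in every fold.
    X_sel = SelectKBest(f_classif, k=k).fit_transform(X, y)
    clf = SVC(kernel="linear", C=1.0)
    return cross_val_score(clf, X_sel, y, cv=LeaveOneOut()).mean()


def nested_loocv_accuracy(X, y, k=7):
    # Feature selection is refit on the training portion of every fold,
    # so the held-out sample never influences which features are kept.
    pipe = make_pipeline(SelectKBest(f_classif, k=k),
                         SVC(kernel="linear", C=1.0))
    return cross_val_score(pipe, X, y, cv=LeaveOneOut()).mean()


# Synthetic noise features with arbitrary labels (hypothetical sizes).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 40))
y_demo = np.array([0, 1] * 15)

acc_non_nested = non_nested_loocv_accuracy(X_demo, y_demo)
acc_nested = nested_loocv_accuracy(X_demo, y_demo)
```

Because the demonstration labels carry no real signal, any accuracy well above chance in the non-nested estimate would come from selecting features on the full data set, which is the leakage the nested scheme is designed to remove.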

An iterative process of training and validation across a range of sample sizes was employed to examine how the amount of training data may impact classifier performance. Learning curves were generated by iteratively incrementing the training sample size and re-assessing the classification model, starting with ten random participants, and increasing in increments of ten randomly selected participants. This procedure was repeated ten times for each model, and the performance estimates were averaged to create the learning curves. The class sizes were balanced in each data subset until sample sizes exceeded 106, at which point only COVID-negative subjects remained.

2.4. VOC stepwise fitting analysis

For the secondary objective, a stepwise fitting method was used to fit the measured spectra to a library of compounds [13]. The library comprises reference absorption data from the Pacific Northwest National Laboratory and the high-resolution transmission molecular absorption (HITRAN) database [18]. There are a total of 502 compounds in the library, of which 133 compounds are present in human breath. Quantification analysis of all 133 VOCs was performed for each breath sample and compared between SARS-CoV-2 positive and negative groups by the Mann-Whitney U test. SVM classifier models were developed using the VOC concentrations as features, to assess whether this approach may offer higher-performance classifier models than using LAS-spectra as features. For feature selection, the VOCs were filtered based on their significance level from Mann-Whitney U testing (i.e. p < 0.05) and were further optimized using classification accuracy, as with the breathprint model. Only VOCs that appeared in at least 25% of samples were considered for the statistical comparisons and classification models. Isoprene and exogenous VOCs were removed from consideration as features in the SVM model. The SVM model was trained with and without ammonia to assess whether the presence of ammonia resulted in differing sensitivity or specificity of the model.

2.5. Statistical analysis

Descriptive analysis was used to summarize the data. Differences in baseline characteristics between groups were assessed using Chi-squared tests (categorical variables), independent t-tests (normally distributed continuous variables), and the Mann-Whitney U-test (non-normal continuous variables). The nested and non-nested classification performance estimates were compared across participant characteristics using Fisher's exact test for categorical variables and an independent samples t-test for continuous variables. A p-value <0.05 was considered statistically significant. Statistical analysis was performed using the Statistical Package for the Social Sciences (SPSS) for Windows, Version 27 (IBM, Armonk, NY, USA).

3. Results

There were 135 patients enrolled and 115 patients provided sufficient exhaled breath samples for inclusion into the final analysis (53 SARS-CoV-2 positive and 62 controls). The number of participants excluded and the reasons for exclusion were similar between the SARS-CoV-2 positive and negative groups suggesting that COVID-19 pneumonia did not adversely impact the feasibility of exhaled breath collection (figure 2).


Figure 2. Study flow diagram.


Of the included participants, SARS-CoV-2 patients had a higher mean body mass index (BMI), a higher prevalence of coronary artery disease and insulin-dependent diabetes mellitus, and received a higher fraction of inspired oxygen (FiO2) (table 1). At the time of breath sample collection, 66% of SARS-CoV-2 patients had radiographic bilateral lung infiltrates and 62% required supplemental oxygen therapy. Among the SARS-CoV-2 patients, 14 had breath sample collection within 7 d of symptom onset (early) and 37 had breath collection after 7 d (late), while in two patients the time from symptom onset to collection was unknown.

Table 1. Subject demographics.

| Variable | SARS-CoV-2 Positive (n = 53) | SARS-CoV-2 Negative (n = 62) | p-value |
| --- | --- | --- | --- |
| Male (%) | 36 (67.9) | 32 (51.6) | 0.08 |
| Mean age in years (S.D.) | 57.7 ± 17.1 | 57.5 ± 12.8 | 0.95 |
| Mean BMI (S.D.) | 29.9 ± 7.3 | 27.0 ± 6.0 | 0.02 |
| Smoking status (%): | | | |
| Current | 1 (1.9) | 3 (4.8) | |
| Ex-smoker | 11 (20.8) | 13 (21.0) | |
| Never smoker | 39 (73.6) | 42 (67.7) | 0.60 |
| Covid variant (%): | | | |
| Original | 18 (34.0) | | |
| Alpha | 30 (56.6) | | |
| Other mutation | 3 (5.7) | | |
| Unknown | 2 (3.8) | | |
| Radiological evidence of pneumonia (%): | | | |
| Unilateral | 7 (13.2) | | |
| Bilateral | 35 (66.0) | | |
| None | 9 (17.0) | | |
| Unknown | 2 (3.8) | | |
| Comorbidities (%): | | | |
| Hypertension | 19 (35.8) | 15 (24.2) | 0.17 |
| Dyslipidemia | 15 (28.3) | 15 (24.2) | 0.62 |
| Coronary artery disease | 9 (17.0) | 1 (1.6) | <0.01 |
| Asthma | 4 (7.5) | 4 (6.5) | 0.82 |
| COPD | 3 (5.7) | 2 (3.2) | 0.52 |
| Chronic kidney disease | 2 (3.8) | 1 (1.6) | 0.47 |
| IDDM | 9 (17.0) | 3 (4.8) | 0.03 |
| NIDDM | 12 (22.6) | 11 (17.7) | 0.51 |
| Medication (%): | | | |
| Dexamethasone | 36 (67.9) | | |
| Remdesivir | 4 (7.5) | | |
| Tocilizumab | 6 (11.3) | | |
| Requiring supplemental oxygen therapy | 33 (62.3) | 0 | <0.01 |
| Median-inspired FiO2 (IQR) | 28 (21–32) | 21.0 | <0.01 |

Definition of abbreviations: BMI = body mass index; COPD = chronic obstructive pulmonary disease; FiO2 = fraction of inspired oxygen; IDDM = insulin-dependent diabetes mellitus; IQR = interquartile range; NIDDM = non-insulin dependent diabetes mellitus; S.D. = standard deviation. Categorical variables were compared using the Chi-squared test and normally distributed continuous variables using the independent samples t-test. Nonparametric variables were compared using the Mann-Whitney U-test.

3.1. Identification of SARS-CoV-2 in breath samples using machine learning classifier algorithm

The median LAS-spectra for the two groups are shown in figure 3. The median LAS-spectra for both groups were similar at higher wavelengths but differed significantly at the lower end of the wavelength range. The ML-classifier model derived from these breathprints achieved a non-nested LOOCV accuracy of 81.7% (77.4% sensitivity, 85.5% specificity) with 7 first-derivative features. The corresponding nested LOOCV accuracy for the model was 72.2% (67.9% sensitivity, 75.8% specificity) with an average of 12.4 features selected across training sets. The receiver operating characteristic curves for the SVM scores obtained with the non-nested and nested LOOCV frameworks are depicted in figure 4. The area under the curve was 0.851 for the non-nested LOOCV approach and 0.727 for the nested LOOCV method. Additionally, we obtained learning curves for both the nested and non-nested LOOCV scenarios as the size of the training sample was incrementally increased (figure 5). These showed that the gain in accuracy of the ML-classifier model began to level off once the sample size exceeded 50 breath samples. Furthermore, the performance of the ML-classifier was robust and consistent across subgroups stratified by pre-specified participant characteristics (tables 2 and 3). There was no significant association between misclassification and participant characteristics such as sex, smoking status, SARS-CoV-2 variant type, time from symptom onset to breath sampling, BMI, age, FiO2 requirements and the presence of chronic kidney disease or diabetes mellitus.
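As a check on the arithmetic, the reported accuracy, sensitivity and specificity can be recomputed from the confusion counts obtained by summing the female and male rows of tables 2 and 3 (non-nested: TP 41, FN 12, TN 53, FP 9; nested: TP 36, FN 17, TN 47, FP 15). The helper name below is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Counts summed from the sex rows of tables 2 and 3.
non_nested = diagnostic_metrics(tp=41, fn=12, tn=53, fp=9)
nested = diagnostic_metrics(tp=36, fn=17, tn=47, fp=15)

for name, m in (("non-nested", non_nested), ("nested", nested)):
    print(name, {k: round(100 * v, 1) for k, v in m.items()})
# non-nested {'accuracy': 81.7, 'sensitivity': 77.4, 'specificity': 85.5}
# nested {'accuracy': 72.2, 'sensitivity': 67.9, 'specificity': 75.8}
```

Both sets of counts total 115 participants (53 positive, 62 negative), and the recomputed percentages agree with the values quoted in the text.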


Figure 3. Median CRDS spectra for SARS-CoV-2 positive (n = 53) and negative (n = 62) patients.


Figure 4. ROC curves for (a) the SVM scores obtained with LOOCV, and (b) the SVM scores obtained with nested LOOCV. The operating points representing an SVM score threshold of 0 are indicated.


Figure 5. Learning curves for the non-nested and nested CRDS breathprint models.


Table 2. Non-nested LOOCV classification performance by variable.

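| Variable | TP (SARS-CoV-2 Positive, n = 53) | FN (SARS-CoV-2 Positive, n = 53) | p-value | TN (SARS-CoV-2 Negative, n = 62) | FP (SARS-CoV-2 Negative, n = 62) | p-value |
| --- | --- | --- | --- | --- | --- | --- |
| Sex (%): | | | | | | |
| Female | 12 (70.6%) | 5 (29.4%) | | 28 (93.3%) | 2 (6.7%) | |
| Male | 29 (80.6%) | 7 (19.4%) | 0.49 | 25 (78.1%) | 7 (21.9%) | 0.15 |
| Age (S.D.) | 59.4 ± 17.3 | 51.8 ± 15.9 | 0.20 | 57.5 ± 11.8 | 57.8 ± 19.9 | 0.95 |
| BMI (S.D.) | 30.3 ± 7.7 | 28.8 ± 5.7 | 0.54 | 27.3 ± 5.7 | 25.2 ± 7.7 | 0.34 |
| Smoking status (%): | | | | | | |
| Current | 0 (0%) | 1 (100%) | | 3 (100%) | 0 (0%) | |
| Ex-smoker | 8 (72.7%) | 3 (27.3%) | | 14 (93.3%) | 1 (6.7%) | |
| Never smoker | 31 (79.5%) | 8 (20.5%) | 0.24 | 34 (81.0%) | 8 (19.0%) | 0.65 |
| Chronic kidney disease (%) | 1 (50.0%) | 1 (50%) | 0.4 | 0 (0%) | 1 (100%) | 0.15 |
| Insulin or non-insulin dependent diabetes mellitus (%) | 15 (71.4%) | 6 (28.6%) | 0.5 | 12 (85.7%) | 2 (14.3%) | 0.99 |
| Covid variant (%): | | | | | | |
| Original | 17 (94.4%) | 1 (5.6%) | | | | |
| Alpha | 22 (73.3%) | 8 (26.7%) | | | | |
| Other mutation | 2 (66.7%) | 1 (33.3%) | 0.15 | | | |
| Onset (%): | | | | | | |
| ⩽7 d | 12 (85.7%) | 2 (14.2%) | | | | |
| >7 d | 28 (75.7%) | 9 (24.3%) | 0.70 | | | |
| Requiring supplemental oxygen therapy (%): | | | | | | |
| Yes | 23 (69.7%) | 10 (30.3%) | | | | |
| No | 18 (90.0%) | 2 (10.0%) | 0.10 | | | |
| FiO2 | 27.9 ± 8.3 | 31.6 ± 10.6 | 0.21 | | | |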

Definition of abbreviations: BMI = body mass index; FiO2 = fraction of inspired oxygen; FN = false negatives; FP = false positives; LOOCV = leave-one-out cross-validation; TN = true negatives; TP = true positives. Data are presented as mean ± standard deviation for continuous variables and % for categorical variables. Fisher's exact test was used to assess associations for categorical variables (sex, smoking, onset, variant, oxygen therapy) and a two-sample t-test was used for continuous variables (age, BMI).
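For readers who wish to examine a subgroup comparison, the two-sided Fisher's exact test on a 2 × 2 table can be sketched in pure Python. This is an illustrative implementation (summing the probabilities of all tables no more likely than the observed one), not the SPSS routine used in the study, and the function name is hypothetical. The example applies it to the non-nested TP/FN counts by sex in the positive group (female 12/5, male 29/7), for which table 2 reports p = 0.49.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Female TP/FN versus male TP/FN, non-nested LOOCV (table 2).
p_sex = fisher_exact_two_sided(12, 5, 29, 7)
```

The resulting `p_sex` can be compared with the tabulated value; small differences may arise because statistical packages differ in how they define the two-sided p-value.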

Table 3. Nested LOOCV classification performance by variable.

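| Variable | TP (SARS-CoV-2 Positive, n = 53) | FN (SARS-CoV-2 Positive, n = 53) | p-value | TN (SARS-CoV-2 Negative, n = 62) | FP (SARS-CoV-2 Negative, n = 62) | p-value |
| --- | --- | --- | --- | --- | --- | --- |
| Sex (%): | | | | | | |
| Female | 10 (58.8%) | 7 (41.2%) | | 24 (80%) | 6 (20%) | |
| Male | 26 (72.2%) | 10 (27.8%) | 0.36 | 23 (71.9%) | 9 (28.1%) | 0.56 |
| Age (S.D.) | 57.5 ± 18.5 | 58.2 ± 13.7 | 0.90 | 56.7 ± 12.0 | 60.1 ± 15.2 | 0.43 |
| BMI (S.D.) | 30.0 ± 7.4 | 29.8 ± 7.2 | 0.95 | 27.7 ± 5.5 | 24.7 ± 7.0 | 0.09 |
| Smoking status (%): | | | | | | |
| Current | 1 (100%) | 0 (0%) | | 2 (66.7%) | 1 (33.3%) | |
| Ex-smoker | 7 (63.6%) | 4 (36.4%) | | 12 (80%) | 3 (20%) | |
| Never smoker | 26 (66.7%) | 13 (33.3%) | 0.99 | 31 (73.8%) | 11 (26.2%) | 0.89 |
| Chronic kidney disease (%) | 1 (50.0%) | 1 (50.0%) | 0.54 | 0 (0%) | 1 (100%) | 0.24 |
| Insulin or non-insulin dependent diabetes mellitus (%) | 12 (57.1%) | 9 (42.9%) | 0.23 | 12 (57.1%) | 2 (14.3%) | 0.48 |
| Covid variant (%): | | | | | | |
| Original | 20 (66.7%) | 10 (33.3%) | | | | |
| Alpha | 14 (77.8%) | 4 (22.2%) | | | | |
| E484K mutation | 2 (66.7%) | 1 (33.3%) | 0.78 | | | |
| Onset (%): | | | | | | |
| ⩽7 d | 11 (78.6%) | 3 (21.4%) | | | | |
| >7 d | 25 (67.6%) | 12 (32.4%) | 0.51 | | | |
| Requiring supplemental oxygen therapy (%): | | | | | | |
| Yes | 20 (60.6%) | 13 (39.4%) | | | | |
| No | 16 (80.0%) | 4 (20.0%) | 0.23 | | | |
| FiO2 | 28.1 ± 8.6 | 30.1 ± 9.5 | 0.47 | | | |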

Definition of abbreviations: BMI = body mass index; FiO2 = fraction of inspired oxygen; FN = false negatives; FP = false positives; LOOCV = leave-one-out cross-validation; TN = true negatives; TP = true positives. Data are presented as mean ± standard deviation for continuous variables and % for categorical variables. Fisher's exact test was used to assess associations for categorical variables (sex, smoking, onset, variant, oxygen therapy) and a two-sample t-test was used for continuous variables (age, BMI).

3.2. VOC stepwise fitting analysis

Four compounds (2-methyl-1-propanal, ammonia, phenol and ethene) were found to be significantly different between the SARS-CoV-2 positive and negative groups (figure 6). The SVM-classifier model derived using VOC features including ammonia achieved a non-nested LOOCV accuracy of 74.7% (71.6% sensitivity, 77.4% specificity) utilizing three compounds and a nested LOOCV accuracy of 63.4% (83.0% sensitivity, 46.7% specificity) utilizing an average of 3.5 compounds across LOOCV training sets. The SVM-classifier model derived using VOC features excluding ammonia achieved a non-nested LOOCV accuracy of 61.7% (62.2% sensitivity, 61.2% specificity) utilizing three compounds and a nested LOOCV accuracy of 60.0% (83.0% sensitivity, 40.3% specificity) utilizing an average of 3.0 compounds across LOOCV training sets.


Figure 6. Dot plot of the four highest ranked VOCs in each patient for both SARS-CoV-2 positive and negative groups.


4. Discussion

This study demonstrates that a machine learning-based breathprint model using CRDS measurements may provide a valuable non-invasive option for detecting SARS-CoV-2 in exhaled breath samples. Current guidelines recommend repeating RT-PCR tests for SARS-CoV-2 in cases of high clinical suspicion or worsening symptoms, and lower airway sampling may assist further in the diagnosis. However, lower airway sampling via bronchoscopy is invasive and aerosol-generating, resulting in an elevated risk of viral transmission due to environmental contamination. The use of exhaled breath to detect SARS-CoV-2 from the lower airways holds great promise as a simple, non-invasive, and accessible technology that can be deployed widely in any setting with minimal training. It has the potential to reach broadly into the community and the healthcare environment, facilitating the rapid and early diagnosis of SARS-CoV-2 infections, particularly in the lower airways. Furthermore, the same technology and approach may be employed to develop unique breathprints for detecting other lower respiratory pathogens as part of future pandemic preparedness and response.

Our ML-breathprint classifier achieved a non-nested and nested accuracy of 81.7% (77.4% sensitivity, 85.5% specificity) and 72.2% (67.9% sensitivity, 75.8% specificity) respectively. As the nested model is known to be inherently pessimistic due to the reduced information available during feature selection, the true generalizable accuracy is expected to lie between the nested and non-nested results [19]. The WHO had previously recommended that SARS-CoV-2 tests that met the minimum performance requirement of ⩾80% sensitivity compared to the gold standard RT-PCR tests could be used to diagnose SARS-CoV-2 in suspected cases [20]. However, false-negative RT-PCR results have been reported in up to 33% of cases, with detection rates dropping to 40% after 5 d of symptoms [21]. In contrast, our study showed no significant effect of time since symptom onset on diagnostic accuracy. Our ML-breathprint classifier demonstrated a false negative rate of 14.2% for patients sampled ⩽7 d and 24.3% for patients sampled >7 d after symptom onset (p = 0.70) using the non-nested analysis. Moreover, we found the performance of the breathprint model was robust and unaffected by the patients' sex, age, smoking status, BMI, FiO2 requirements and the presence of chronic kidney disease or diabetes mellitus.

In this study, we identified 4 VOCs that differed significantly between the SARS-CoV-2 positive and negative groups. The SVM-classifier developed from these 4 VOCs, including ammonia, achieved a non-nested LOOCV accuracy of 74.7% (71.6% sensitivity, 77.4% specificity) and a nested LOOCV accuracy of 63.4% (83.0% sensitivity, 46.7% specificity). The SVM-classifier model excluding ammonia achieved a non-nested LOOCV accuracy of 61.7% (62.2% sensitivity, 61.2% specificity) and a nested LOOCV accuracy of 60.0% (83.0% sensitivity, 40.3% specificity). Ammonia is one of the most abundant VOCs in breath and its breath concentration can be a marker for multiple disease states including but not limited to diabetes mellitus, renal failure, hepatic encephalopathy and the presence of H. pylori causing peptic ulcers [22]. Exline et al developed a rapid, non-invasive breathprint test using exhaled nitric oxide and ammonia that detected SARS-CoV-2 pneumonia with 88% accuracy in patients upon their admission to the intensive care unit [23]. We found that the inclusion of ammonia in our SVM-classifier model improved accuracy compared to excluding ammonia. Our SARS-CoV-2 positive population was symptomatic, oxygen dependent and hospitalised, with 79.2% of patients demonstrating pneumonia radiologically, similar to the population studied by Exline et al. There was no significant difference in the prevalence of non-insulin dependent diabetes or chronic kidney disease between our SARS-CoV-2 positive and negative populations that would explain our results; however, there were significantly more patients with insulin-dependent diabetes within our SARS-CoV-2 positive population (SARS-CoV-2 positive n = 9, SARS-CoV-2 negative n = 3, p = 0.03). Nevertheless, the presence of ammonia may reflect increased oxidative stress secondary to SARS-CoV-2 infection, and the inclusion of ammonia may aid in earlier disease detection. The breathprint ML model found no significant effect of diabetes on classification performance, supporting the findings of our SVM-classifier model.

The SVM-classifier model accuracy including and excluding ammonia was lower than the ML-breathprint classifier which achieved a higher non-nested accuracy of 81.7% (77.4% sensitivity, 85.5% specificity) and a nested accuracy of 72.2% (67.9% sensitivity, 75.8% specificity). This suggests that the breathprint model is more accurate in detecting SARS-CoV-2 positivity than VOC analysis alone. However, when comparing the two models, it should be noted that the VOC model was trained using a maximum of only 4 features, while the breathprint model had a maximum of 20 features that could be selected during optimization.

To date, one emergency use authorisation has been issued by the FDA for the InspectIR device, a portable, rapid GCMS test which performs a qualitative analysis for five VOCs in the ketone and aldehyde families [24]. This device reported a sensitivity and specificity of 91.2% and 99.3% respectively in a study of 2409 individuals tested from November 2020 to May 2021, prior to the delta variant surge in the United States of America. The InspectIR device study prospectively collected breath samples from symptomatic and asymptomatic participants within 5 min of the RT-PCR test. In contrast, our study focused on symptomatic patients hospitalised with pneumonia, with a time lag from RT-PCR test to breath collection because of the delay between diagnosis and the development of pneumonitis in SARS-CoV-2 disease. These differences preclude a direct comparison of the two studies. The FDA also reported that a further clinical study, which focused on patients infected with the omicron variant, showed similar sensitivity. However, the VOC signature can differ between SARS-CoV-2 variants, with differing specificity and sensitivity. This has important implications when developing VOC-based breath tests, as samples from differing waves may need to be analysed separately rather than using one universal approach [25]. Therefore, future studies of the InspectIR device should assess its ability to detect SARS-CoV-2 across differing variants, in patients with severe disease, and in those with predisposing conditions.

Previous breath studies, which mainly focused on individual VOCs, have reported variable findings. To date, no unifying set of VOCs has been shown to distinguish SARS-CoV-2 positive from negative groups across these studies [8, 24, 26–28]. Each study has identified VOCs belonging to different chemical groups including aldehydes (methylpent-2-enal, benzaldehyde, octanal, heptanal, nonanal), ketones (acetone and 2-butanone) and VOCs yet to be identified. Our study found that an exhaled aldehyde, 2-methyl-1-propanal, could be used as part of an SVM-classifier to differentiate SARS-CoV-2 positive from negative groups. Aldehydes are derived from inflammatory processes and are elevated in exhaled breath from patients with acute respiratory distress syndrome [29] and asthma exacerbations [30]. Furthermore, exhaled acetaldehyde has been shown to correlate linearly with intranasal viral loads of influenza A in infected swine [31]. However, an in vitro study of human cells inoculated with influenza A or Streptococcus pyogenes, alone or together, showed that only cells with bacterial infection released high levels of aldehydes [32]. Therefore, elevated levels of aldehydes may not be specific in differentiating SARS-CoV-2 from bacterial infections. Future studies of exhaled VOCs are needed to validate our findings and to compare aldehyde levels in SARS-CoV-2 patients with and without superimposed bacterial infection.

Since the appearance of SARS-CoV-2 in 2019, the virus has undergone continuous mutations, with a small fraction of these mutations giving rise to variants of concern [33]. It has been shown that SARS-CoV-2 mutations and variants can affect the performance of the gold standard RT-PCR [34], resulting in reduced efficacy and sensitivity [35]. A strength of our study is that we included patients infected with different SARS-CoV-2 genomic variants and found no significant effect on the performance of the breathprint-classifier model. This is clinically relevant because the SARS-CoV-2 genome continues to mutate, which necessitates ongoing re-evaluation of RT-PCR performance. Such re-evaluation is both costly and time-consuming, and results in a diagnostic lag with the emergence of each new variant.

To our knowledge, this is one of the largest exhaled breath studies to date [8, 27, 36–42] and it includes a diverse population of patients with mild to severe SARS-CoV-2 infection. Ibrahim et al previously reported identifying SARS-CoV-2 positive patients using GCMS to identify a set of seven exhaled breath VOCs [8]. However, their study included patients admitted during the first SARS-CoV-2 wave in the United Kingdom in 2020, with more severe pneumonia. Notably, over 90% of their SARS-CoV-2 positive patients demonstrated bilateral pneumonia compared to 66% in our study. Furthermore, 31% of their SARS-CoV-2 patients were RT-PCR negative, even though the clinical suspicion was high, thus adding a level of uncertainty to the interpretation of their findings. In contrast, our study only included SARS-CoV-2 patients with a positive RT-PCR test, while those with a high clinical suspicion but a negative RT-PCR were excluded. This provided a more defined study population and more reliable breathprint results. Furthermore, the robust performance of our breathprint-classifier in mild infections will become increasingly important with the emergence of variants that are highly transmissible but cause less severe disease. Ibrahim et al did not report the time between symptom onset and breath sampling, whereas our study demonstrated no significant association between misclassification and time since symptom onset (p = 0.51). A time lag is to be expected between symptom onset and the development of SARS-CoV-2 pneumonia or hospitalisation, ranging from 3 to 10.4 d [43]. Vancheri et al [44] previously reported that early in the SARS-CoV-2 time course reticular changes are more common, while consolidation is less frequent and shows an increasing trend over time. Similarly, Pan et al demonstrated that lung abnormalities on chest CT showed greatest severity approximately 10 d after symptom onset.

Two prior studies used laser spectroscopy of exhaled breath, and one study used laser spectroscopy of pharyngeal samples, to distinguish SARS-CoV-2 positive from negative patients. Shlomo et al utilized a Fourier-transform infrared spectroscopy approach to examine exhaled breath, demonstrating a sensitivity and specificity of 100%. However, the authors included only asymptomatic patients, mainly men, recruited from the emergency department [45]. Furthermore, clinical characteristics including variant type and steroid use were not recorded, which limits the generalisability of their study findings. Liang et al examined exhaled breath utilizing a cavity enhanced direct frequency comb spectroscopy (CE-DFCS) approach to distinguish SARS-CoV-2 positive from negative university students [42]. The authors found that a pattern-based approach (AUC of 0.849) outperformed a molecular-based approach (AUC of 0.769). The CE-DFCS approach is similar to our methodology of CRDS; however, the authors included a younger population (median age 23.0 and 22.0 in the SARS-CoV-2 positive and negative groups respectively) with mild disease, as no participants were severely ill or required hospitalisation at the time of their sample collection. Furthermore, variant type and time between symptom onset and breath sampling were not recorded. Barauna et al also used a Fourier-transform infrared spectroscopy approach to discriminate SARS-CoV-2 positive and negative patients; however, they used pharyngeal samples from hospital patients, with their approach achieving a sensitivity of 95% and a specificity of 89% [46]. Clinical characteristics including variant type and steroid use were not recorded. Similar to our study, Barauna et al found that the breathprint region was 1800–900 cm−1, which corresponds to our finding that the median LAS-spectra for both groups were similar at higher wavelengths but differed significantly at the lower end of the wavelength spectrum (figure 3).
An advantage of our study over all three is that it showed the performance of our breathprint model was robust in relation to the patients' sex, age, smoking status, variant type, time between disease onset and breath sampling, BMI and FiO2 requirement. Furthermore, the performance of our breathprint model for the diagnosis of more severe patients is important to reduce the requirement for aerosol generating procedures such as bronchoscopy in RT-PCR negative patients.
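Because the spectral region above is quoted in wavenumbers (cm−1) while the comparison is phrased in terms of wavelength, a short unit conversion may help the reader relate the two. The Python sketch below applies the standard relation λ (µm) = 10 000 / ν̃ (cm−1); the function name is an illustrative assumption and the code is not part of the analysis pipeline used in this study.

```python
def wavenumber_to_wavelength_um(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a vacuum wavelength in micrometres."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    # 1 cm = 10,000 um, so wavelength (um) = 10,000 / wavenumber (cm^-1).
    return 1.0e4 / wavenumber_cm1


# Band edges quoted in the text for the FTIR breathprint region.
for nu in (1800.0, 900.0):
    print(f"{nu:.0f} cm^-1 -> {wavenumber_to_wavelength_um(nu):.2f} um")
```

A higher wavenumber corresponds to a shorter wavelength, so the 1800–900 cm−1 region spans roughly 5.6 to 11.1 µm.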

There are limitations associated with this study, including a relatively small sample size. However, our learning curve analysis suggests that the performance of the ML-breathprint classifier may not improve substantially with a larger sample size. It should be noted that the classification performance could be further improved if new model architectures (different preprocessing, features, classification models, etc) are developed, possibly informed by a larger dataset. Secondly, our control group excluded patients with respiratory symptoms or respiratory conditions, including other respiratory pathogens. Further work is needed to assess whether our models can differentiate SARS-CoV-2 from other respiratory pathogens. Thirdly, both cohorts had a small subgroup of patients with chronic kidney disease (SARS-CoV-2 positive n = 2, SARS-CoV-2 negative n = 1, p = 0.47), insulin-dependent diabetes mellitus (SARS-CoV-2 positive n = 9, SARS-CoV-2 negative n = 3, p = 0.03) and non-insulin dependent diabetes mellitus (SARS-CoV-2 positive n = 12, SARS-CoV-2 negative n = 11, p = 0.51); however, we found no significant effect of these comorbidities on the classification performance of the breathprint ML model.

In conclusion, our study examined a novel approach to detecting SARS-CoV-2. This comprised user-friendly hardware that required minimal training to collect exhaled breath from the lower airways, and an ultrasensitive LAS method to detect VOCs, which were then classified by a machine learning algorithm for the presence of disease based on unique patterns of VOCs. In addition to the clinical benefit of providing an alternate method for studying the lower airways, this technology and approach can provide improved accessibility and scalability to meet surge capacity, rapid detection, and the versatility to develop new models quickly using the same technology for future outbreaks.

Acknowledgments

Jennifer Wattie—shipping specimens and sample management

Caroline Munoz—generating eCRF

Alex Chiasson—preparing the data for analysis

Sudipta Das, PhD—preparing the data for analysis

Samuel Peter, PhD—preparing the data for analysis.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

Authors and contributors

M D, T O S, R J, G M G and T H were responsible for the conception and design of the study. R P C, M R, J L R, C E W, T S, E C, K H, C S, R J, T H and M D were responsible for the collection and assembly of the data. R L, C B M, M B, P F P, E S, G B and S G were responsible for expertise in breath analysis as a whole, study technology and breath data analysis. R P C and M D were responsible for verifying clinical data and data analysis. R P C, R L, C B M, and G B had access to the raw data. R P C, R L, C B M, E C, S G, G B, G M G, and M D were responsible for analyzing and interpreting the data. R P C, R L, C B M, E S, G B, G M B and M D were responsible for drafting the article. All authors were responsible for the final approval of the article. R P C and M D were responsible for the decision to submit the article for publication. T O S and M D were responsible for obtaining funding. K H, C S and G M G provided regulatory oversight.

Conflict of interests

M B, P F P, S G and G B were employees of Breathe Biomedical at the time of the study. All other authors have no relevant conflicts of interest.

Supplementary data (<0.1 MB PDF)

10.1088/1752-7163/ad2b6e