Identification of a characteristic VOCs pattern in the exhaled breath of post-COVID subjects: are metabolic alterations induced by the infection still detectable?

SARS-CoV-2 is expected to cause metabolic alterations due to viral replication and the host immune response, resulting in increased cytokine secretion and cytolytic activity. The present prospective observational study explores the potential of breath analysis to discriminate between subjects with a documented history of symptomatic SARS-CoV-2 infection who, at the time of enrollment, exhibited a negative nasopharyngeal swab and acquired immunity (post-COVID) and healthy subjects with no evidence of previous SARS-CoV-2 infection (no-COVID). The main purpose is to understand whether traces of the metabolic alterations induced during the acute phase of the infection are still detectable after negativization, in the form of a characteristic volatile organic compound (VOC) pattern. A total of 60 volunteers aged between 25 and 70 years were enrolled in the study (post-COVID: n = 30; no-COVID: n = 30) according to well-defined criteria. Breath and ambient air samples were collected by means of an automated sampling system (Mistral) and analyzed by thermal desorption-gas chromatography-mass spectrometry (TD-GC/MS). Statistical tests (Wilcoxon/Kruskal–Wallis test) and multivariate data analysis (principal component analysis (PCA), linear discriminant analysis) were performed on the data sets. Among all compounds detected (76 VOCs in more than 90% of breath samples), 5 VOCs (1-propanol, isopropanol, 2-(2-butoxyethoxy)ethanol, propanal and 4-(1,1-dimethylpropyl)phenol) showed abundances in breath samples from post-COVID subjects that differed significantly from those in the no-COVID group (Wilcoxon/Kruskal–Wallis test, p-values <0.05). Although the separation between the groups was not fully satisfactory, the variables showing significant differences between the two groups and higher PCA loadings are recognized biomarkers of COVID-19 according to previous studies in the literature.
Therefore, based on the outcomes obtained, traces of metabolic alterations induced by SARS-CoV-2 infection are still detectable after negativization. This evidence raises questions about the eligibility of post-COVID subjects in observational studies aimed at the detection of COVID-19. (Ethical Committee Registration number: 120/AG/11).


Rationale of the study
The rapid and significant spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) among the worldwide population over the last 2 years, resulting in more than 57 million people infected and tens of millions of deaths (based on the latest reported estimations), has strongly compromised the global healthcare system and has led to social disruption and economic standstill [1,2]. The urgent need to contain the pandemic has therefore required a remarkable effort from scientists and stakeholders worldwide, directed at the prompt application of prevention measures and at the development of reliable diagnostic methodologies for the early detection of the virus and the management of the disease. The reverse-transcription polymerase chain reaction (RT-PCR) technique applied to nasopharyngeal and oropharyngeal swabs is recognized as the gold standard methodology for virus detection and is currently used as a screening and diagnostic tool on a global scale [3]. Despite ensuring high sensitivity and specificity, i.e., low levels of false positives and negatives, RT-PCR on nasopharyngeal and oropharyngeal swabs has proved to be a time-consuming procedure that requires laboratories equipped to a high biosafety level and trained medical personnel, and it is perceived as unpleasant and invasive by patients. To fill the gap between the increasing demand for screening and the available analytical tools, researchers in the medical and technological fields have recently paid remarkable attention to the development and validation of alternative, noninvasive testing methods. In this regard, the most recent scientific literature reports promising preliminary outcomes from the application of breath analysis to the early diagnosis of COVID-19.
It appears evident, based on existing knowledge in the field, that a screening methodology based on the metabolomic analysis of exhaled breath, i.e., the chemical characterization of volatile organic compounds (VOCs) and the identification of the disease-related fingerprint, can be regarded as a strategic clinical approach for the early detection of different oncologic pathologies (e.g., lung cancer, colorectal cancer) and disease-related metabolic disorders (e.g., kidney failure, liver disease, Alzheimer's disease) [4][5][6][7][8]. More recently, based on the scientifically validated association between the onset of respiratory tract viral infections and specific cellular metabolic changes leading to the production and exhalation of a VOC pattern [9], scientists have also explored the potential of breath analysis for discriminating between the healthy population and COVID-19 patients, for both diagnosis and follow-up [10][11][12][13][14]. The promising results obtained to date through the complementary application of sensors and advanced analytical techniques (e.g., e-Nose, GC/MS, GC-IMS) support breath analysis as a potential point-of-care test for large-scale SARS-CoV-2 screening campaigns [15][16][17].
Although further research is needed to confirm the preliminary outcomes in larger cohorts, specific VOC patterns allowing accurate discrimination between COVID-19 patients and healthy subjects or other disease-affected population groups have been tentatively identified in prospective observational studies: benzaldehyde, 1-propanol, 3,6-methylundecane, camphene, beta-cubebene and iodobenzene by Ibrahim et al (2021) (discrimination between COVID-19 patients and controls) [18]; methylpent-2-enal, 2,4-octadiene, 1-chloroheptane and nonanal by Grassin-Delyle et al (2021) (discrimination between COVID-19 and non-COVID-19 acute respiratory distress syndrome patients) [19]; ethanal, octanal, acetone, butanone and methanol by Ruszkiewicz et al (2020) (discrimination between COVID-19 patients and subjects affected by other pathological conditions) [15]; octanal, heptanal and nonanal, recognized as by-products of cell membrane destruction following oxidative stress, by Berna et al (2021) (pediatric cohort study showing discrimination between SARS-CoV-2 positive and negative adolescents) [20]. In addition, the study carried out by McCartney et al showed that the COVID-related VOC signature may be variant-specific [21]. Alongside the validation of innovative diagnostic tools based on breathomics, the latest scientific literature also shows a growing interest within the medical community in the onset of cellular metabolic disorders in patients affected by COVID-19 (including the mild symptomatic form) and in their potential persistence after negativization [22].
The main mechanisms leading to the alteration of cellular metabolism have indeed already been elucidated, e.g., the depletion of cellular metabolic resources for SARS-CoV-2 replication (mainly affecting cholesterol metabolism) and the depletion of angiotensin-converting enzyme 2 (ACE2), with a direct negative effect on the antioxidant response and the upregulation of metabolites related to oxidative stress [23,24]. Therefore, by identifying the characteristic VOC pattern related to the metabolic alteration induced by SARS-CoV-2 infection, human breathomics could not only act as a diagnostic tool for infected subjects from the early stages, but could also make it possible to determine whether, and for how long, the metabolic alterations persist in individuals after negativization. This could also be a crucial issue in COVID-19 management, on the assumption that the persistence of cellular metabolic disorders in individuals who have completely recovered from the acute phase of the infection and exhibit a negative SARS-CoV-2 PCR test might be a predictive indicator of disease complications, e.g., the single or multiple organ dysfunction associated with post-COVID syndrome.

Objective of the study
In the present study, the authors explore the potential of breath analysis for identifying the metabolomic breath signatures of patients with a past history of symptomatic SARS-CoV-2 infection who were negative for the virus at the time of breath collection, and of control subjects with no documented previous infection. The main objectives of the investigation are: (a) to determine the extent of the discrimination between the two selected population groups; (b) to compare the specific VOC pattern detected in SARS-CoV-2 negative patients (with previous symptomatic infection) with the VOC breathprint reported in the existing literature for SARS-CoV-2 positive patients (during the acute phase of the infection), in order to understand whether traces of the metabolic alterations induced by the infection are still detectable.

Study design and population
A prospective observational study was carried out by the research group of the Environmental Sustainability Laboratory of the Department of Biosciences, Biotechnologies and Environment of the University of Bari, in collaboration with the medical team of the Department of Emergency and Organ Transplantation (D.E.T.O.) of the University of Bari, after approval by the Ethical Committee (Registration n. 120/AG/11). The research was conducted in accordance with the principles embodied in the Declaration of Helsinki and with local statutory requirements. A total of 60 volunteers aged between 25 and 70 years were enrolled in the study, according to well-defined inclusion and exclusion criteria [25]. The enrollment of volunteers was carried out between July 2020 and February 2021, during the second epidemic wave in Italy. The subjects were equally split into two distinct groups, group A and group B. Group A (post-COVID) includes 30 health care professionals employed at D.E.T.O. who, at the time of enrollment in the study, had a previous history of SARS-CoV-2 infection (documented by a positive nasopharyngeal swab) and of moderately symptomatic COVID-19 disease (i.e., fever above 38 °C and upper respiratory tract symptoms such as persistent cough and shortness of breath). Moreover, group A subjects showed a negative nasopharyngeal swab and acquired immunity to COVID-19 (certified by a positive serologic test) at the time of enrollment. Group B (no-COVID), instead, includes health care professionals working at D.E.T.O. who had no evidence of previous SARS-CoV-2 infection.
Additional exclusion criteria were applied in the selection process in order to minimize the confounding factors that could affect the experimental results: pregnant women and patients affected … All subjects involved in the study, i.e., medical personnel, underwent a nasopharyngeal/oropharyngeal swab every three weeks according to the hospital provisions and exhibited a negative swab within 24 h before breath collection. The breath samples were collected at least 120 days (range 120 to 250 days) after recovery from SARS-CoV-2 infection and before vaccination for COVID-19. All the volunteers involved in the study were also asked to refrain from eating, drinking and smoking for at least 12 h. Exhaled breath collection was standardized for all subjects and was carried out in the same room of D.E.T.O., where the volunteers remained for at least 10 min before breath collection so that an equilibrium was established between the lung and ambient air (AA). All enrolled subjects gave written informed consent before inclusion in the study. Demographic data and information about comorbidities, chronic drug intake, smoking habit, alcohol consumption and family history of oncologic diseases were collected and are summarized in table 1. No significant differences exist between the two groups of recruited subjects.

Breath sampling procedure
All the samples investigated in the present study were collected with an automated sampling device named Mistral. Mistral is an innovative medical device for breath sampling developed by the R&D company Predict s.r.l. (Bari, Italy) with the scientific support of the Department of Biosciences, Biotechnologies and Environment of the University of Bari. Using disposable mouthpieces, the automated device collects the end-tidal fraction of the exhaled breath and transfers it directly onto suitable adsorbent cartridges at a selected sample flow rate of 200 ml min⁻¹. In detail, the device collects only the last 150 ml of each exhalation thanks to a volume control system and a sampling buffer. The adsorbent cartridges selected for the present study are two-bed sorbent tubes packed with Tenax TA and Carbograph 5 TD (Bio-monitoring steel tube, Markes International Ltd, UK). The device is equipped with a thermal control system and a heating plate in order to avoid the formation of moisture along the sampling lines. The joint operation of the temperature regulation and control systems keeps the temperature stable over time and comparable to human body temperature (temperature range: 36 °C-37 °C). Before each breath sampling session, a preliminary cleaning procedure is carried out by purging all the sampling lines of the device with 1 l of air. The sampling session then starts: first, the device collects 750 ml of AA onto the dedicated adsorbent cartridge. A final volume of 750 ml of exhaled end-tidal breath is then collected from each volunteer, who breathes calmly while seated at rest. The potential breath sample contamination associated with the VOC background of the device has been exhaustively evaluated in a previous study [26].
Based on the outcomes obtained, specific compounds such as 1,4-pentadiene, 2-hexanone, 6-methyl-5-hepten-2-one and 2,2,4,6,6-pentamethyl-heptane were associated with the device background as a result of emissions from the device construction materials [27]. Therefore, these VOCs were excluded from the discussion of the experimental results reported herein.
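As a rough worked example of the sampling figures given above, the following Python sketch estimates how many exhalations and how much pumping time a 750 ml end-tidal sample implies. It assumes that every exhalation contributes the full 150 ml and that the 200 ml min⁻¹ flow rate applies throughout; the function name and these no-loss assumptions are illustrative and are not part of the Mistral specification.

```python
import math

def sampling_plan(target_ml=750.0, per_breath_ml=150.0, flow_ml_min=200.0):
    """Return (exhalations needed, minimum minutes of flow) for one sample.

    Assumes each exhalation yields exactly per_breath_ml of end-tidal air
    and that the sorbent tube is loaded at a constant flow_ml_min.
    """
    exhalations = math.ceil(target_ml / per_breath_ml)
    minutes = target_ml / flow_ml_min
    return exhalations, minutes

exhalations, minutes = sampling_plan()
print(exhalations, minutes)  # 5 exhalations, 3.75 min of flow
```

The real session length also depends on the breathing rate of the volunteer, so the pumping time should be read as a lower bound.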

Chemical characterization: TD-GC/MS analysis
Once collected onto the adsorbent cartridges, VOCs were thermally desorbed and analyzed by means of a thermal desorber (UNITY-2, Markes International Ltd) coupled with a gas chromatograph (GC 7890, Agilent Technologies) and a mass selective detector (MS 5975, Agilent Technologies) located at the CerBA laboratory. In the present study the applied analytical methodology was further optimized with respect to our previously published studies [25,28]. For quality assurance, before each use the adsorbent cartridges were preconditioned at 330 °C for 30 min with pure helium (99.999%) at a flow of 50 ml min⁻¹, as recommended by the manufacturer, and then analyzed to verify the VOC background level. After conditioning, the adsorbent cartridges were stored at 4 °C until use. VOCs adsorbed onto the cartridges (both AA and breath samples) were thermally desorbed at 300 °C for 10 min (desorption flow 30 ml min⁻¹, splitless mode for both primary and secondary desorption steps) and then refocused onto a cold trap at 20 °C. The cold trap was selected during the optimization of the analytical methodology because it is specific for the management of wet samples (U-T4WMT-2S Water Management, Markes International Ltd, UK) and suitable for trapping organic compounds in the range C3-C20. The cold trap was then flash-heated at 300 °C, allowing VOCs to be promptly transferred in a narrow band to the head of the GC column via the inert transfer line (connecting the thermal desorber to the GC/MS system), heated at 200 °C. The GC column used for VOC characterization was 60 m × 250 µm × 0.25 µm film thickness with a 5% diphenyl/95% dimethyl polysiloxane stationary phase (VOCOL®-Supelco). The carrier gas (helium) flow, equal to 1.3 ml min⁻¹, was controlled in constant pressure mode. The oven program used for optimal VOC separation was: 37 … Single target ions were extracted from TIC chromatograms (Extracted Ion Chromatograms, EIC mode) to assist VOC identification.
One quantifier ion and one or two qualifier ions were selected for each VOC on the basis of their selectivity and abundance. Compound identification was based on the comparison of the obtained mass spectrum with those included in the National Institute of Standards and Technology library (2017) and was considered positive for a library search match >800 for both forward and reverse matching. Additional criteria for compound identification were: (a) the matching of relative retention times (tR) with those of the authentic standards within the allowed deviation of ±0.05 min; (b) the matching of the measured ion ratios with those of the authentic standards within a tolerance of ±20%. Only chromatographic peaks with an intensity higher than five times the baseline signal were integrated, and the corresponding areas (compound abundances) were included in the data set. In order to highlight the most significant differences between the two population groups (post-COVID and no-COVID), all peaks in the collected chromatograms were explored even if a definite chemical attribution was not possible (tentative identification/indication of ions).
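The acceptance criteria listed above can be summarized as a simple check. The Python sketch below is only an illustration under stated assumptions: the function and argument names are hypothetical, the ±20% tolerance is interpreted as relative to the ion ratio of the authentic standard, and the numerical inputs in the usage lines are invented, not measured values from this study.

```python
def accept_peak(forward_match, reverse_match, rt_sample, rt_standard,
                ratio_sample, ratio_standard, peak_height, baseline):
    """Apply the identification and integration criteria to one candidate peak."""
    # Library search: both forward and reverse match factors must exceed 800.
    library_ok = forward_match > 800 and reverse_match > 800
    # Retention time must agree with the authentic standard within 0.05 min.
    rt_ok = abs(rt_sample - rt_standard) <= 0.05
    # Qualifier/quantifier ion ratio within 20% of the standard (assumed relative).
    ratio_ok = abs(ratio_sample - ratio_standard) <= 0.20 * ratio_standard
    # Peak intensity must be higher than five times the baseline signal.
    intensity_ok = peak_height > 5 * baseline
    return library_ok and rt_ok and ratio_ok and intensity_ok

# Hypothetical values, for illustration only.
print(accept_peak(870, 905, 12.43, 12.45, 0.52, 0.50, 6000, 1000))  # True
print(accept_peak(780, 905, 12.43, 12.45, 0.52, 0.50, 6000, 1000))  # False
```

Peaks failing the library or standard checks were still explored in the study as tentative identifications, so a check of this kind would flag a peak rather than discard it.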

Chemometric data treatment
The differences in VOC levels between breath and AA background samples were analyzed with paired tests. The distribution of the data was checked with a Shapiro-Wilk test: if the data were normally distributed, a paired t-test was used; otherwise nonparametric tests (Wilcoxon signed-rank, Wilcoxon/Kruskal-Wallis) were applied to the dataset. Differences between VOC levels in group A (post-COVID) and group B (no-COVID) were also analyzed. All statistical tests were performed using the R software (v. 3.6.2, The R Foundation), and p-values less than 0.05 were taken to identify species showing a statistically significant difference at a confidence level of 95%. Moreover, in order to capture the maximum variability within the data, principal component analysis (PCA) was performed in R (v. 3.6.2, The R Foundation) using the prcomp function and the factoextra package. PCA is a multivariate data analysis technique used to reduce the dimensionality of the data while preserving its structure. It relies on a singular value decomposition, yielding eigenvalues and eigenvectors that define a reduced data subspace; a correlation matrix was used to enhance the influence of spectral features. Linear discriminant analysis (LDA) was applied to the dataset with the lda() function of the MASS package, available from the Comprehensive R Archive Network (http://cran.r-project.org), and validated by means of a leave-one-out cross-validation approach. More specifically, to evaluate the classification accuracy, repeated cross-validation tests were conducted on random training and test sets drawn from the main dataset.
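To make the between-group comparison more concrete, the sketch below implements a two-sided Wilcoxon rank-sum (Mann-Whitney U) test in plain Python. It is a minimal illustration, not the R routine used in the study: it uses the normal approximation with a tie correction and no continuity correction, so p-values may differ slightly from those returned by R, and the abundance values in the usage lines are invented.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic of the first group, p-value).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, p_value

# Hypothetical peak areas for one VOC in the two groups (illustration only).
post_covid = [1.0, 2.0, 3.0, 4.0, 5.0]
no_covid = [6.0, 7.0, 8.0, 9.0, 10.0]
u, p = rank_sum_test(post_covid, no_covid)
print(u, p < 0.05)
```

For small samples without ties, an exact test is generally preferable to the normal approximation used here.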

Experimental results and discussion
Taking into account the current level of knowledge on COVID-19 pathophysiology and the experimental results obtained to date in several studies reported in the recent literature, SARS-CoV-2 infection would be detectable through the VOC pattern in exhaled breath because the virus causes a systemic inflammatory response that influences the metabolism of sick subjects [12]. On the basis of the scientific evidence already acquired worldwide, the present study aims to understand whether the metabolism of subjects previously affected by SARS-CoV-2 remains altered even after negativization and complete recovery. Therefore, a total of 60 breath samples collected from subjects with and without a previous history of SARS-CoV-2 infection at the time of enrollment were analyzed, and the GC/MS chromatograms allowed the identification of about 100 compounds with main masses ranging from 39 to 168 m/z and retention times between 3.6 and 40.1 min. Among these compounds, 76 VOCs were found in more than 90% of the breath samples. The nonparametric Wilcoxon/Kruskal-Wallis test was applied to the VOC abundance data for AA samples and for breath samples collected from the no-COVID and post-COVID groups. The results showed that, among all compounds detected, the abundances of only 11 VOCs in breath samples were significantly different from those detected in AA samples and, among these, only five VOCs showed abundances in breath samples from post-COVID subjects that were significantly different from those in breath samples from the control group (p-values <0.05, as reported in table 3).
Then, unsupervised PCA was applied to the VOCs with p-values less than 0.05. The first two principal components together account for 71% of the data variance, although the separation obtained between the groups was not fully satisfactory (figure 1).
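The detection-frequency filter mentioned above (compounds found in more than 90% of the breath samples) can be sketched as follows. This is a hypothetical helper, not the authors' processing script: the function name, the data layout (one list of detected compound names per sample) and the example compounds are assumptions made for illustration.

```python
def frequent_compounds(samples, min_fraction=0.9):
    """Return the compounds detected in more than min_fraction of the samples."""
    counts = {}
    for detected in samples:
        for compound in set(detected):  # count each compound once per sample
            counts[compound] = counts.get(compound, 0) + 1
    threshold = min_fraction * len(samples)
    return sorted(name for name, k in counts.items() if k > threshold)

# Ten hypothetical samples: propanal is missing from one of them.
samples = [["acetone", "isoprene", "propanal"]] * 9 + [["acetone", "isoprene"]]
print(frequent_compounds(samples))  # ['acetone', 'isoprene']
```

With a strict "more than 90%" rule, a compound present in exactly 9 of 10 samples is excluded; a "at least 90%" rule would keep it.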
According to the loadings of the variables, the descriptors contributing most are 1-propanol, propanal and isopropanol for PC1 and 4-(1,1-dimethylpropyl)phenol for PC2 (figure 2). Even though poor visual clustering was obtained when the scores of the two data groups were displayed with respect to the first two principal components, the variables showing significant differences between the two groups and higher PCA loadings are consistent with those reported in the literature as biomarkers of COVID-19. As is known, when a subject is infected with SARS-CoV-2, specific protein and metabolite changes are observed and more than 100 lipids are down-regulated in the blood [29]. Processes such as SARS-CoV-2 entry and replication, humoral and cellular immunity and the cytokine storm can induce the formation of new VOCs or alter the normal VOC composition in blood and thus in breath [30]. Moreover, when SARS-CoV-2 binds to the angiotensin-converting enzyme 2 (ACE2) receptor, clearly distinct metabolic pathways are activated within infected cells [31]. Studies conducted on coronavirus 229E (HCoV-229E) in kidney and liver cell cultures have highlighted altered metabolic pathways involving macrophage dysregulation, platelet degranulation and massive metabolic suppression [31], which potentially drive the formation of virus-specific VOCs. Finally, COVID-19 also leads to an increase in oxidative stress, which causes a higher oxidation rate of fatty acids in addition to the lipid peroxidation of cell membranes and results in the presence of alcohols, aldehydes and ketones in the breath [15,24,32,33]. VOCs such as 1-propanol and isopropanol, previously identified as biomarkers of bacterial pneumonia [34], lung cancer [35,36] and asthma [37], were also found to be biomarkers of COVID-19 in the studies conducted by Woollam et al [12], Chen et al [11] and Ibrahim et al [18].
In addition, aldehydes such as propanal are derived, along with hydrocarbons, from lipid peroxidation and inflammatory processes and have been widely reported in several studies [38,39].
Moreover, as can be observed in figure 1, the data related to group B (healthy control group) show a relatively higher intra-class variability than the scores related to the post-COVID subjects (group A), most of which lie in the third quadrant of the PCA score plot. Probably, even if the VOCs dysregulated by viral infection tend to return to baseline levels upon recovery [12,40], a memory effect of the virus affects the breathprint of recovered subjects and contributes to reducing the variability of the data. In this regard, when linear discriminant analysis (LDA) was applied to the collected data and validated by means of a leave-one-out cross-validation approach, an accuracy of up to 77% (with sensitivity and specificity equal to 0.86 and 0.60, respectively) in classifying post-COVID and no-COVID subjects was obtained.
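For readers less familiar with the performance figures quoted above, the following Python sketch shows how sensitivity, specificity and accuracy are derived from a confusion matrix. It assumes that post-COVID is treated as the positive class, which the text does not state explicitly, and the counts in the usage lines are hypothetical and are not the confusion matrix of this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: positive-class subjects classified correctly/incorrectly.
    tn/fp: negative-class subjects classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts for 10 positive and 10 negative subjects.
sens, spec, acc = classification_metrics(tp=8, fn=2, tn=7, fp=3)
print(sens, spec, acc)  # 0.8 0.7 0.75
```

In a leave-one-out scheme, each subject is classified once by a model trained on all the others, and the counts are accumulated over all left-out subjects before these ratios are computed.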
It should be underlined that better outcomes for this study could be obtained with larger sample cohorts. However, these results are interesting and promising because they refer to data collected from healthy and recovered subjects and are therefore not affected by the drug therapies used during the onset and course of SARS-CoV-2 disease. In our opinion, the latter aspect is in fact the main limitation of the studies conducted on this issue until now [11].
To the best of the authors' knowledge, there are few published studies exploring VOC differences within the post-COVID population group. In our opinion, and on the basis of the outcomes of the present study, post-COVID subjects could act as confounding factors. This aspect is crucial for evaluating the actual feasibility of breath analysis for COVID-19 diagnosis when considering disease relapse. However, in order to validate the results obtained in the present study, breath samples collected from the same subject over time after the viral infection are needed, with the specific purpose of understanding the time range within which the potential metabolic alterations persist.

Conclusions and future perspectives
The main limitation of the previously published studies on breath analysis aimed at the early diagnosis of subjects affected by COVID-19 is linked to the use of medications by these subjects. Anti-inflammatory medications are generally recommended for patients with SARS-CoV-2 infection, and their intake could affect the VOC breathprint of these subjects. Moreover, to the best of our knowledge, few studies exploring the VOC composition of breath exhaled by subjects recovered from COVID-19 have been conducted until now. Investigating metabolic alterations after recovery in greater depth is crucial to understand whether recovered subjects can act as interfering factors in COVID-19 diagnosis and to evaluate the actual feasibility of breath analysis for COVID-19 diagnosis, also considering disease relapse.
For this reason, in our study we recruited both subjects recovered from COVID-19 and healthy control subjects, and the breath analysis of these subjects showed a pattern of VOCs linked to COVID-19, in agreement with the outcomes reported in previous studies. Although the VOCs dysregulated by viral infection tend to return to baseline levels upon recovery, a memory effect of the virus probably affects the breathprint of recovered subjects. Further studies aimed at collecting breath samples from the same subjects during COVID-19 disease and over time after recovery, according to a specific experimental design, should be conducted to validate the results obtained in this study. Moreover, the development and validation of a method that takes into account the evolution of the VOC breath pattern over time could also be useful for the identification and diagnosis of so-called Long-COVID, which mainly manifests as lung damage in symptomatic and asymptomatic people at least 2-6 months after COVID-19 onset.

Data availability statement
The data cannot be made publicly available upon publication because they contain sensitive personal information. The data that support the findings of this study are available upon reasonable request from the authors.

Funding
This research received no external funding.