External validation of a novel signature of illness in continuous cardiorespiratory monitoring to detect early respiratory deterioration of ICU patients

Objective: The goal of predictive analytics monitoring is the early detection of patients at high risk of subacute potentially catastrophic illnesses. An excellent example of a targeted illness is respiratory failure leading to urgent unplanned intubation, where early detection might lead to interventions that improve patient outcomes. Previously, we identified signatures of this illness in the continuous cardiorespiratory monitoring data of intensive care unit (ICU) patients and devised algorithms to identify patients at rising risk. Here, we externally validated three logistic regression models to estimate the risk of emergency intubation developed in Medical and Surgical ICUs at the University of Virginia. Approach: We calculated the model outputs for more than 8000 patients in the University of California—San Francisco ICUs, 240 of whom underwent emergency intubation as determined by individual chart review. Main results: We found that the AUC of the models exceeded 0.75 in this external population, and that the risk rose appreciably over the 12 h before the event. Significance: We conclude that there are generalizable physiological signatures of impending respiratory failure in the continuous cardiorespiratory monitoring data.


Introduction
Patients in the intensive care unit (ICU) that experience respiratory failure leading to emergent intubation have significantly longer hospital length-of-stay and higher inhospital mortality (Politano et al 2013, Kim et al 2014, Moss et al 2016. Early identification of patients requiring intubation may allow earlier intervention to reduce morbidity and mortality. Patients in the early stages of respiratory failure might be given corticosteroids, bronchodilators, supplemental oxygen, or placed on noninvasive positivepressure ventilation (Nagata et al 2015). Patients for whom noninvasive positive-pressure ventilation is insufficient, or recently extubated patients that require re-intubation, might be intubated electively rather than emergently (Keim-Malpass et al 2018b, Liu et al 2018).
Timely intervention aimed at better preparation for intubation, team coordination, proper intubation medication selection, and avoidance of peri-intubation hypotension improve outcomes of patients requiring emergent intubation. These observations have never been more true than they are now, during the COVID-19 pandemic when emergency intubation endangers the providers as well as the patients (Alhazzani et al 2020).
Predictive analytics monitoring strives to identify high-risk patients earlier by providing continuous risk estimates to clinical personnel in real-time. These early warning signals can allow intervention to alter the patient's trajectory in a more favorable direction. In a prospective study using analogous predictive analytics for sepsis displayed at the bedside of preterm infants, mortality was reduced by more than 20% , Schelonka et al 2020. Similarly, in adults, predictive analytics for sepsis may reduce rates of septic shock and associated mortality by up to 50% (Shimabukuro et al 2017, Ruminski et al 2019. Candidate risk marker models for subacute potentially catastrophic illnesses like respiratory failure leading to urgent unplanned intubation (Ramachandran et al 2011, Kim et al 2014 are usually based only on static demographics and comorbidities, and intermittent vital signs and laboratory tests (Davis et al 2020). As a result, these models may reflect the decisions of clinicians more so than changes in patient physiology (Beaulieu-Jones et al 2021). That is, if the electronic health record shows that a clinician ordered a stat blood gas and chest x-ray, is it really a prediction to say that respiratory failure is imminent? If the physician thought of it first, do these analytics represent the leading indicators of a patient's illness or are they just the lagging indicators of clinicians' actions? On the other hand, models based on continuous cardiorespiratory dynamics from bedside physiological monitors have the advantage of directly reporting on the patient's condition. Clinician-initiated interventions do not directly influence them.
Before using a predictive analytical model for prospective clinical practice, it is vital to validate that model externally across different patient populations and institutions (Moons et al 2012, Collins et al 2014. Different care units and institutions may have substantially different distributions of demographics, socio-economic groups, admitting practices, and care patterns, all of which may degrade the calibration and discriminatory performance of a model. This study tested the hypothesis that predictive dynamic analytical models for respiratory failure leading to urgent unplanned intubation using continuously available data from ubiquitous physiologic monitors are well-suited for application at an external center.

Study design
We retrospectively studied a cohort of patients admitted to an ICU at the University of California San Francisco Medical Center (UCSF). We studied admissions to the two mixed medical/surgical ICUs, two neurological ICUs (NICUs), and coronary care (CCU) ICUs. Each ICU has continuous physiological monitoring archived by BedMaster (Hillrom, Chicago IL). The primary outcome was respiratory failure leading to urgent unplanned intubation.

Study population and primary outcome
We studied consecutive ICU admissions from 1 May, 2013, through 30 April, 2015, and selected patients intubated in the ICU. We excluded intubation events before ICU admission (e.g. in the operating room or emergency department (ED)) and patients with 'Do Not Resuscitate' or 'Do Not Intubate' orders (DNR/DNI). We classified events as planned (elective) or unplanned (emergent). Planned intubations included those done for procedures (such as endoscopy, interventional radiology procedures, or preceding elective operations) and others documented to be elective. We considered all others to be unplanned. We examined the procedure notes and the physician notes in individual medical records to determine the reason for intubation. We extracted the timing of intubation from the nursing and respiratory therapist's documentation. Two independent practitioners independently reviewed each potential case of emergent intubation.

Identification of mechanically ventilated patients
We excluded data for patients during epochs when they were already mechanically ventilated. Thus, it was necessary to know the time of all intubations and extubations (not only the emergent intubations). While we knew the times of emergent intubations, other intubation and extubation times were not available. We instead used ventilator respiratory rate flowsheet entries from respiratory therapists (RT) as a surrogate to identify periods of mechanical ventilation, merging the results with known times of emergent intubations. We reasoned that this parameter, which specifies that the respiratory rate was measured from the ventilator, was a sufficiently reliable indicator of the presence of mechanical ventilation.
We extracted the total ventilator respiratory rate flowsheet entry documented by RTs. Figure 1 (left) shows the probability density of time between RT documentation: most RT documentation of the ventilator respiratory rate occurred more frequently than every 6 h, consistent with clinical practice and affirming the utility of defining mechanical ventilation this way. Mechanical ventilation was defined as starting at the first recording of ventilator respiratory rate and ending at the last ventilator respiratory rate; see figure 1 (right) example patient 1. We split the period of mechanical ventilation when the time between consecutive measurements from a patient was larger than 16 h and identified the patient as not ventilated in the interim; see figure 1 (right) example patient 2. Isolated measurements (i.e. those without another measurement within 16 h) were used to define the start of a ventilation epoch with a duration of 1 h; see figure 1 (right) example patient 3. For emergently intubated patients, we verified that the epochs of mechanical ventilation started at the time of intubation. When emergent intubation occurred in the middle of an automatically determined ventilation epoch, we split the mechanical ventilation epoch defining the time of extubation as the time of the preceding ventilator respiratory rate and the time of intubation as the time of emergent intubation.

Predictors of emergent intubation
We calculated three risk estimates for emergent intubation (Politano et al 2013, Moss et al 2016. The models were developed on data from SICU and MICU patients at the University of Virginia (UVa) Health System and use only continuous cardiorespiratory dynamics calculated from the bedside physiological monitors. From Politano et al (2013) we used the vital signs only model. This model was developed on a subset of surgical ICU patients, excluding those on ancillary services. We also used the SICU and MICU models of Moss et al (2016). We previously validated the model of Politano at UVa for predicting upgrade from surgical intermediate care unit (IMU) to ICU, with and without intubation (Blackburn et al 2017). The SICU model of Moss was also validated at the same institution as part of a model for identifying low-risk patients at the time of surgical IMU and ICU discharge (Blackburn et al 2018).
All three models are binary logistic regression. The Politano model uses linear relationships between predictors and the response, while the Moss models include cubic splines to allow for non-linear relationships. The Politano model was built with a forward stepwise procedure, while the Moss models use a pre-specified feature set based on prior clinical knowledge. The output of each model is the estimated probability of emergent intubation, though with different time horizons: 24 h for the Politano model, 4 h for the Moss MICU model, and 6 h for the Moss SICU model.
Model inputs are were calculated in 30 min windows with 50% overlap: means and standard deviations of, and cross-correlations between, vital signs (heart rate, respiratory rate, peripheral oxygen saturation, and blood pressure), as well as statistical measures of cardiac dynamics (slope of log RR interval variance versus log scale for detrended fluctuation analysis (Peng et al 1994), coefficient of sample entropy (Lake and Moorman 2011), and the standard deviation of RR intervals).
Missing data were imputed with the median values from the UVa development cohort. We divided the estimated probability by the average probability of emergent intubation in the training set to yield the fold-increase in the probability of the event, which we label relative risk. Features and models were calculated using CoMET ® (AMP3D, Charlottesville, VA). No features or models were available to the care team, and all patients received standard of care. The UCSF Institutional Review Board approved this study with a waiver of consent.

Statistical analysis
We evaluated the performance of the UVa urgent intubation risk models for identifying UCSF patients before emergent intubation. We used the continuous risk estimates from the three models and, where appropriate, a binary response variable. The response was defined as '1' for event patients during the time window immediately preceding the emergent intubation (the event population). The response was defined as '0' for those patients that were never emergently intubated (control patients) as well as for event patients far from the time of emergent intubation.
We analyzed the dynamic changes of each model near the time of emergent intubation. We aligned the every-15 min risk estimates relative to the time of emergent intubation (defined as time zero). For each model, at each 15 min epoch, we calculated the average predicted risk for all event patients with data and plotted the average time series relative to the time of emergent intubation. We used the Wilcoxon sign rank test every 15 min to test the null hypothesis that risk estimates were equal to risk estimates for the same patient 12 h prior. At time points where we rejected the null hypothesis at 0.05 significance level, we interpreted this as a significant change in estimated risk that may have provided an early warning.
We evaluated the discriminatory value of each model using the area under the receiver operating characteristic (AUC), also known as the C-statistic. We evaluated the performance of risk estimates for varying event time window definitions: window start times ranged from 4 to 24 h before emergent intubation. The window ended at the time of emergent intubation. Confidence intervals were determined by 200 bootstrap runs, resampled by hospital admission.
Finally, we evaluated the calibration of each model. For each model, we calculated the deciles of predicted relative risk. We calculated the observed risk in each decile as the proportion of measurements that corresponded to times within 12 h of emergent intubation and divided by the average risk from the training set. We then plotted the observed versus predicted relative risk: perfectly calibrated models fall on the line of identity.

Results
We studied 9828 admissions for 8434 patients to UCSF. There were 240 episodes of emergent intubation in 238 hospital admissions. There was continuous monitoring data before 225 (93%) of events. Table 1 shows the characteristics of the study population. Patients with emergent intubation were less likely to be white, more likely to be Asian, had 3-fold higher mortality, and stayed 19 d longer in hospital. We calculated 105.5 patient-years of risk estimates: the three risk estimates were calculated for 3.7 million 15 min epochs. We censored 32.3% of measurements for patients who were already mechanically ventilated or patients following DNI orders. The incidence of emergent intubation was 0.9 per 100 ICU d, significantly lower than in the Moss (Moss et al 2016) and Politano (Politano et al 2013) development cohorts (2.1 and 2.8 per 100 ICU d, respectively). Figure 2 shows the length of time patients were in the ICU and not mechanically ventilated before emergent intubation: about 120 patients were in the ICU and not mechanically ventilated at least 24 h before the event. The figure also shows the number of events with continuous monitoring data available for modeling. Figure 3 shows the time course of risk estimates for UCSF patients using the three UVa models for urgent unplanned intubation during the 48 h preceding emergent intubation. Risk estimates doubled over the 12-24 h before emergent intubation, from 1.5 to more than 3. At each time, we performed a signed-rank test with the null hypothesis that risk estimates are equal to risk estimates from the same patient 12 h prior. White points identify times when we rejected the null hypothesis. Risk estimates were significantly higher beginning about 5 h before emergent intubation. Figure 4 shows the performance of the three models as a function of the time window before emergent intubation. All the data from patients without emergent intubation served as 'control.' Thus, for a 24 h detection window, the data for event patients within 24 h of emergent intubation were identified as 'event,' and the data from event patients far from the event were 'control.' The AUC rose from about 0.75 to about 0.78 for event windows 24 to 4 h, respectively. The MICU model from Moss had slightly better performance than the SICU models from Moss or Politano. Figure 5 shows the calibration of the three models using a 12 h detection window. Wellcalibrated models have predicted risk equal to observed risk (dashed line). The SICU models from Moss and Politano exhibited excellent calibration in all but the highest risk patients, where they overestimated the risk. The MICU intubation model from Moss also overestimated risk for high-risk patients and deviated from the line of identity in the two lowest deciles of predicted risk.

Discussion
This study evaluated the performance of 3 predictive analytical models for early detection of respiratory failure leading to emergent intubation at an external center. These models are based solely upon analyses of continuous cardiorespiratory monitoring data to detect physiologic signatures of illness. The existing models were initially developed to predict emergent intubation in the UVa SICU (two models) and MICU (one model). This study sought to externally validate the performance using an independent cohort of patients from all ICUs University of California San Francisco. The major findings were that the models accurately identified patients at risk for emergent intubation (AUC > 0.75 starting at 12 h pre-intubation), risk estimates continued to rise until the time of emergent intubation, and these predictive models were well-calibrated.
Whereas elective intubation in the operating room is a highly safe procedure, emergent intubation outside the operating room is commonly associated with complications (Rochlen et al 2017, Wardi et al 2017, Arya et al 2019. These complications include hemodynamic compromise or severe hypoxemia leading to cardiac arrest, esophageal intubation, aspiration, and pneumothorax (Heffner et al 2013, Wardi et al 2017. Cardiac arrest attributable to intubation complications occurs in 2%-4% of emergent intubations, and few of these patients survive to hospital discharge even when initially resuscitated (Mort 2004, Heffner et al 2013, Rochlen et al 2017. Identifying patients earlier who are at risk of needing emergent intubation maybe vital in avoiding these poor outcomes (Rochlen et al 2017, Keim-Malpass et al 2018a. Recent studies link poor outcomes to potentially mitigatable factors that are time-dependent. These include a worse outcome for patients intubated around the change of nursing shift (Wardi et al 2017). Wardi et al proposed that this finding resulted from potential staff fatigue, hand-off errors, and lack of familiarity with a patient, which led to overlooking subtle but important changes in patient condition (Wardi et al 2017). For each of these potential explanations, predictive algorithms such as those validated in this current study might overcome these challenges at vulnerable moments for our critically ill patients. Essentially, these algorithms convert complex data that is difficult for a clinician to assimilate in real-time into risk estimates based on detecting illness-specific signatures. The result is an early warning signal displayed at the bedside that can draw attention to critical changes in the patient's condition. Here, we found that the pathophysiological signature of respiratory failure in medical and surgical ICU training sets transferred well to identify emergent intubation in the medical, surgical, neurological, and coronary care ICU cohorts of the validation set. This finding comports with the results of Moss et al who found that respiratory failure had a similar signature of illness between surgical and medical ICU patients (Moss et al 2016).
Additionally, our data affirm that the clinical deterioration before respiratory failure leading to urgent unplanned intubation is often a slowly progressive process over many hours. This time frame creates an opportunity to perform interventions that would make emergent intubation safer and even possibly avoided. A pre-intubation checklist integrating interventions to treat hemodynamic instability led to fewer complications (Jaber et al 2010, Rochlen et al 2017, Wardi et al 2017. This is consistent with investigations that have correlated the worse outcomes in emergent intubation in the ED occurring in those that develop hemodynamic compromise following intubation (Heffner et al 2013, Trivedi et al 2015. Commonly, patients requiring emergent intubation are in shock, and an increased shock index highly correlates with intubation-associated cardiac arrest (Trivedi et al 2015, Wardi et al 2017. Volume depletion, vasodilation, acidosis, and reduced venous return resulting from increased positive pressure ventilation are all factors addressable in the lead time provided by an early warning signal. Trivedi et al argue that the shock index (the ratio of maximum heart rate to lowest systolic blood pressure) is a helpful adjunct to intervene in the immediate 60 min preceding intubation (Trivedi et al 2015). However, they acknowledge that addressing the 'dynamic changes in patient status in the ICU…require continuous monitoring and interpretation of data before the development of overt hypotension and cardiorespiratory collapse.' We propose that implementing and integrating predictive analytics monitoring into clinical practice (Prudente Moorman) may provide an opportunity for timely clinical action, as the prediction of respiratory failure leading to urgent unplanned intubation can be present and growing for hours pre-intubation. We cannot definitively determine from this study if interventions during this window would avoid intubation, but it would likely make them safer. Both of these possibilities warrant further investigation in a prospective study. Any such prospective study should leverage processes for optimal integration of predictive analytics for evoking clinical action (Keim-Malpass et al 2018a).
This study was limited in several ways. We relied on surrogate data to initially identify patients who had mechanical ventilation initiated during their ICU stay. It is possible that due to documentation errors, the location of intubation initiation could have been misclassified. However, to avoid this, each potential case of ICU intubation was reviewed by two clinicians to verify the location of intubation, the reason for intubation, and the timing of intubation. Similarly, we attempted to identify tracheostomy patients that may go on and off mechanical ventilation and excluded those with mechanical ventilation initiation following tracheostomy. Finally, we did not exclude patients with more than one event of respiratory failure leading to urgent unplanned intubation; this happened, however, only in two of the 238 patients.
Our results show that all models are well-calibrated for the lowest 90% of predicted risk but overestimate the risk in the highest decile: patients in the highest decile have 3-to 4-fold higher risk of respiratory failure than average, but the estimated risk is 30-to 40-fold higher than average. Accurate calibration at low and moderate risks may allow these models to be used for accurate clinical assessment of patient status as well as response to interventions. Overestimation at higher risk may limit such uses at these levels but still may draw attention to high-risk patients as intended. It may be useful to cap predicted risk estimates for clinical implementation to mitigate this issue for practical purposes.
Determining the exact timing of emergent intubation was challenging in this study. Physician documentation through intubation procedure notes was not a reliable source of the timing as it was apparent in chart review that this documentation is often delayed. This is understandable as the physician's attention around the event is focused on providing bedside care. To address this, we confirmed the intubation time with the first nursing documentation of administered intubation-related medications cross-referenced with the RT documentation of initial ventilator settings. In all cases, these times were within 5 min of each other. A more accurate evaluation of the lead time of the prediction would be to use the exact moment that the clinical decision was made to intubate. We do not know this, of course, but we know from the clinical review of the records that the actual intubation came only a short time later. We note that any delay makes the predictive model look better because the worsening derangements of the patient status make for higher risk estimates. We defend the practice as the best available and superior to a standard method of using the highest model estimate observed during the hospitalization, even if the model estimate took place weeks before the event.
In addition, we did not quantify the performance of predictive models in the context of known risk factors: diagnoses, demographics, or severity of illness (Ramachandran et al 2011, Kim et al 2014. We note, however, that these known risk factors are static indicators; though they may identify high-risk patients, they do not rise leading up to the time of lung failure. Finally, we did not evaluate other predictors of emergent intubation, such as shock index from vital signs and laboratory measurements. Adding independent streams of information improves performance (Moss et al 2017), though we found that plug-and-play models with bedside physiological monitors have excellent performance.

Conclusion
Earlier identification of signatures of illness using continuous cardiorespiratory monitoring that arise from subtle changes in physiologic deterioration may provide a valuable adjunct to clinicians to mitigate the need for emergent intubation. All measurements within 6 h are combined into a single epoch (patient 1), whereas measurements separated by more than 16 h are split into multiple epochs (patient 2), and isolated measurements are defined as 1 h epochs (patient 3). Note that patient 2 was intubated twice, but only two patients had two respiratory failure events leading to urgent unplanned intubation.  Average time course of risk estimates over the 48 h leading up to the time of emergent intubation. Relative risk is the fold-increase in the probability of an event with respect to average. The gray ribbon is the 95% confidence interval around the mean. White points indicate that the risk estimates at that time are significantly higher (p < 0.05) than risk estimates 12 h prior.  Area under the receiver operating characteristic as a function of the window size before emergent intubation defined as the event, from 4 to 24 h. The 95% confidence interval is indicated by error bars and was determined by 200 bootstrap runs resampled by admission.

Author Manuscript
Author Manuscript Author Manuscript Calibration curves for the three models for emergent intubation. The observed relative risk is plotted as a function of the predicted risk. Each point represents 10% of the data, and the line of identity (perfect calibration) is shown as a dashed line.