1 Introduction

We have previously described quantification of pulmonary gas exchange from machine learning (ML) analysis of bedside monitoring data [1]. In that study we showed by computer simulation that trained ML analysis of blood gas, indirect calorimetry, and cardiac output measurements can generate gas exchange assessments based on an adaptation of West’s ventilation/perfusion (V/Q) lung model [2, 3]. The assessments resemble multiple inert gas technique (MIGET) reports, albeit free of MIGET’s technical challenges [4, 5].

The reference methodology in this ‘scaled back’ MIGET space is the Automatic Lung Parameter Estimator (ALPE) [6,7,8,9], which analyses a similar suite of inputs. Both methods address a major limitation of the three-compartment lung model of Riley and Cournand [10, 11], which groups all oxygen transfer deficits within its shunt compartment and reports them as ‘venous admixture’ (VenAd). As highlighted previously [1], the ability to distinguish the relative influences of true shunt versus low V/Q effects in hypoxemia can influence management in conditions such as COVID-19 pneumonia [12].

Our ‘adapted West’ model enables separate quantification of these components by partitioning VenAd into ‘shunt’ and ‘low V/Q’, where low V/Q = VenAd − shunt. Both indices are expressed as percentages of pulmonary blood flow. ALPE quantifies low V/Q by a separate metric, as a notional PO2 gradient between alveolar gas and pulmonary end-capillary blood [9].

A drawback common to both methods is the need to collect data at more than one inspired oxygen fraction (FiO2), introducing potential signal distortion from absorption atelectasis and altered hypoxic pulmonary vasoconstriction [13, 14]. For high fidelity estimates the adapted West method currently requires blood gases measured at two structured FiO2 settings [1], whereas ALPE institutes as many as four FiO2 shifts for an extended series of pulse oximetry measurements [8].

It has become apparent that the ‘Two–FiO2’ requirement of our approach could be eliminated by incorporating mean alveolar PCO2 (mean PACO2), measurable at the bedside using volumetric capnography [15]. With these additional monitoring data, just one set of measurements collected at any operating FiO2 should suffice. If shown to be accurate and reproducible, this ‘Single–FiO2’ method would enable critical care practitioners to distinguish shunt from low V/Q contributions to VenAd rapidly without altering the operating FiO2.

We therefore tested the following hypotheses in silico:

  1. Data from blood gas analysis, indirect calorimetry, cardiac output, and volumetric capnography measurements analyzed in combination can quantify oxygenation deficits in terms of percentage shunt flow (V/Q = 0) versus percentage low V/Q flow (V/Q > 0).

  2. Consistent reports can be generated by ML analysis of data collected solely at the operating FiO2.

2 Materials and methods

We tested the above hypotheses through a ‘reverse engineering’ simulation.

The simulation in brief (Fig. 1).

  1. Using the adapted West lung model, we generated blood gas and mean PACO2 data at various FiO2 settings from simulated indirect calorimetry and cardiac output measurements across a range of shunt values, arterial hemoglobin-oxygen saturations (SaO2), and acid-base and hemoglobin-oxygen affinity conditions.

  2. We divided these data into a training set and a test set.

  3. With the training set we trained an ML application to recover shunt values solely from Single-FiO2 ‘bedside’ monitoring data.

  4. We then tasked the trained ML application with the ‘blinded’ recovery of the test set shunt values.

  5. Precise recovery of shunt values using only ‘bedside’ data would document the accuracy and reproducibility necessary for clinicians to determine shunt contributions to VenAd.

  6. Low V/Q contributions could then be calculated as ‘VenAd − shunt’.

Fig. 1 Architecture of the ‘reverse engineering’ method. ‘ML’ is machine learning. ‘VenAd’ is venous admixture, calculated as per Eq. 10 in the Supplementary Material. Other abbreviations as in Tables 1 and 2

2.1 The lung model

The ‘adapted West’ model currently runs via VBA sub-routines on Excel (Microsoft, Redmond, WA). Now updated to include N2 exchange, it incorporates log normal distributions of pulmonary blood flow across 20 compartments spanning a broad range of V/Q ratios, plus a separate shunt compartment (V/Q = 0). A model description including core equations is available in the Supplementary Material.

2.2 The simulation in more detail

More than 15,200 unique combinations of model inputs within pre-defined ranges (Table 1) were created by a Python program. For each combination the adapted West model generated blood gases and mean PACO2 values, with the FiO2 adjusted so that 0.87 ≤ SaO2 ≤ 0.98 (Table 2). As with our previous ‘Two-FiO2’ method evaluation [1], simulated scenarios encompassed a cross section of O2 consumption (VO2) and delivery, CO2 production (VCO2) and transport, hemoglobin-oxygen affinity, and respiratory and metabolic acid-base status.

Table 1 Monitoring inputs used by the model to generate the ranges of scenario blood gases and PACO2 values in Table 2
Table 2 Model-generated blood gas and mean PACO2 data

The final dataset was subjected to simple randomization, then divided into a training set of 14,736 data rows and a test set of 500 data rows. We settled on 500 rows for the final test set, rather than 10% of the total dataset (approximately 1500 rows), because 500 rows were sufficient for meaningful statistical analysis. Following ML training and validation (see below), shunt values were recovered for each test scenario (n = 500) with the true shunt values ‘held back’.
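The randomization and split described above can be sketched as follows. This is a minimal illustration only: the function name, the fixed seed, and the integer placeholder rows are assumptions made for the example and are not taken from the study's own program.

```python
import random


def shuffle_and_split(rows, n_test=500, seed=0):
    """Shuffle rows reproducibly, then hold back the last n_test rows as the test set."""
    if n_test > len(rows):
        raise ValueError("n_test exceeds the number of rows")
    shuffled = list(rows)  # copy so the caller's data are left untouched
    random.Random(seed).shuffle(shuffled)
    split = len(shuffled) - n_test
    return shuffled[:split], shuffled[split:]


# Placeholder rows standing in for the simulated scenarios (hypothetical data).
rows = list(range(14_736 + 500))
train, test = shuffle_and_split(rows)
print(len(train), len(test))  # 14736 500
```

Holding back a fixed number of rows, rather than a fixed fraction, mirrors the choice of a 500-row test set described in the text.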

Recovery was performed solely by ML analysis of the following ten ‘bedside’ data elements: FiO2, blood hemoglobin concentration (Hb), SaO2, arterial pH, arterial oxygen tension (PaO2), arterial CO2 tension (PaCO2), mean PACO2, cardiac output, VCO2 and R (the respiratory exchange ratio VCO2/VO2). Recovered shunt estimates were then compared with their ‘true’ shunt counterparts.

2.3 ML training

[See ‘Machine (Deep) Learning terminology’ and ‘Deep Learning Model—more detail’ in Supplementary Material].

The ‘Keras’ library [16, 17] was used to construct deep learning models, each with an input layer of 10 features, plus densely connected intermediate layers and a final layer with one unit. The ReLU activation function was utilised. The optimizer was RMSprop, and the loss function was MSE (mean squared error).

The dataset was shuffled, and input features normalized. Overfitting could not be induced with increased numbers of layers, units/layer, and training epochs, consistent with a simulated dataset free from observational errors and from the inherent variation of natural phenomena.

We selected a model with six densely connected intermediate layers and 128 units/layer, resulting in 84,097 trainable parameters. The model was trained to predict shunt over 500 epochs on a dataset of 14,736 physiological records.
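The stated parameter count can be checked with simple arithmetic. The sketch below assumes that every layer is a standard fully connected layer with one bias per unit; the helper name is hypothetical and the code does not use Keras itself.

```python
def dense_param_count(n_inputs, hidden_units, n_outputs=1):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in list(hidden_units) + [n_outputs]:
        total += fan_in * units + units  # weight matrix plus one bias per unit
        fan_in = units
    return total


# 10 input features, six intermediate layers of 128 units, one output unit.
n_params = dense_param_count(10, [128] * 6, 1)
print(n_params)  # 84097
```

The total breaks down as 1,408 parameters for the first intermediate layer, 5 × 16,512 for the remaining five, and 129 for the output unit.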

2.4 Sensitivity analyses

We assumed FiO2 accuracy and determined sensitivities to the remaining nine ML inputs. A representative scenario was selected in which both shunt and low V/Q blood flow contribute to a significant gas exchange deficit, as summarized by the VenAd value. The nine monitoring inputs could then be varied individually above and below their ‘true’ values while effects on VenAd and shunt (and therefore low V/Q, defined as VenAd − shunt) were tracked in graphic format.

This exercise was conducted using a direct (non-ML) method of model back-calculation (described in the Supplementary Material). Corresponding VenAd calculations to accompany shunt estimates were performed by application of Eq. 10 in the Supplementary Material.

2.5 Statistical methods

The data set consisted of pairs of actual and estimated shunt values. To demonstrate the fit of the model, univariate regression was undertaken to assess the relationship between each estimated result and its actual partner, using the actual partner as the dependent variable. Results were reported as the regression slope, regression constant and the coefficient of determination (R2). With perfect agreement between actual and estimated values, these parameters would be 1.00, 0.00 and 1.00 respectively. The 95% confidence interval and associated p-value were also reported.
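As an illustration of the three reported parameters, the following sketch fits an ordinary least squares line with the actual value as the dependent variable. It is not the Stata analysis used in the study; the function name and the sample values are hypothetical.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y on x; returns (slope, constant, R2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    constant = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, constant, r_squared


# Hypothetical shunt percentages with perfect agreement between estimate and actual.
estimated = [5.0, 12.0, 20.0, 31.0, 40.0]
actual = [5.0, 12.0, 20.0, 31.0, 40.0]
slope, constant, r_squared = simple_linear_regression(estimated, actual)
```

With identical inputs, as here, the slope and R2 are 1 and the constant is 0, which is the perfect-agreement case described above.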

Deviations of estimated from actual values were calculated and plotted as ‘estimate – actual’. Mean (SD) and median (IQR) were summarized with the data range.
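A minimal sketch of this error summary is shown below, using the ‘estimate – actual’ sign convention stated above. The function name and the four sample pairs are hypothetical, and the quartile method (Python's default ‘exclusive’ method) is an assumption that may differ from the one applied in the study.

```python
import statistics


def summarize_errors(estimates, actuals):
    """Summarize deviations computed as estimate minus actual."""
    errors = [e - a for e, a in zip(estimates, actuals)]
    q1, _, q3 = statistics.quantiles(errors, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(errors),
        "sd": statistics.stdev(errors),  # sample standard deviation
        "median": statistics.median(errors),
        "iqr": (q1, q3),
        "range": (min(errors), max(errors)),
    }


# Hypothetical shunt percentages, for illustration only.
summary = summarize_errors([10.2, 19.9, 30.5, 5.1], [10.0, 20.0, 30.0, 5.0])
```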

A kernel density estimate (KDE) plot was also constructed. This enabled a visual representation of both the underlying distribution of the data and the accuracy of the fit of the predicted shunt values.

The level of significance was set at α = 0.05 for all relevant analyses. The statistical analysis and associated plots were undertaken using Stata (version 17.0).

3 Results

Linear regression and error calculation results are set out in Tables 3 and 4, while relationships between true and estimated shunt values are illustrated graphically in Fig. 2 by KDE and error plots, along with a plot of true shunt versus estimates.

Table 3 Univariate linear regression for shunt. Sample size was 500
Table 4 Error calculation (actual – estimate)
Fig. 2 Shunt estimates versus true values. Three subplots share the same X-axis scale. The kernel density estimate (KDE) plot (upper graph) illustrates the distribution of observations for the independent variable along with goodness of fit. The Y-axis in the KDE plot is dimensionless. The solid line (true shunt values) and the dashed line (shunt estimates) are closely aligned. Close agreement, slightly reduced at Shunt ≤ 15%, is evident in the error plot (middle graph), and in the plot of true shunt versus shunt estimates (lower graph)

Close agreement is demonstrated. The calculated error, although small, is slightly right-skewed (skewness = +0.53) and not normally distributed (Shapiro–Wilk p < 0.001), with most of the error occurring at Shunt < 15%. The plot of actual shunt versus shunt estimates confirms overall close agreement, and best approximates the line of identity when Shunt ≥ 15% (Fig. 2). These findings are also reflected in the adjacent error plot.

3.1 Sensitivity analysis

The nine model inputs selected for the sensitivity evaluation were:

Hb 9.30 g/dL, pH 7.364, PaCO2 39.4 mm Hg, PaO2 69.0 mm Hg, SaO2 0.94, mean PACO2 30.1 mm Hg, cardiac output 5.25 L/min, VCO2 187 mL/min, R 0.74.

At FiO2 = 0.38, model calculations produced the following estimates which determine the shunt / low V/Q split:

VenAd 18.5%

Shunt 10.8%
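For this scenario the low V/Q component follows directly from the definition low V/Q = VenAd − shunt. The short sketch below works through the arithmetic; the function name is hypothetical.

```python
def low_vq_percent(venad_percent, shunt_percent):
    """Low V/Q flow as a percentage of pulmonary blood flow (VenAd minus shunt)."""
    return venad_percent - shunt_percent


low_vq = low_vq_percent(18.5, 10.8)
print(round(low_vq, 1))  # 7.7
```

In other words, of the 18.5% venous admixture, 10.8% of pulmonary blood flow is attributed to true shunt and the remaining 7.7% to low V/Q flow.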

Responses to individual input variations are illustrated in Figs. 3, 4, 5. PaO2 and pH variation had the smallest overall effects on shunt percentage and VenAd. Each of the remaining monitoring inputs produced distinct shifts in shunt percentage. Concurrent alterations in VenAd were also evident, except with variations in PaCO2 and mean PACO2.

Fig. 3 Variation of mean PACO2 (mean alveolar PCO2) measurements above and below the ‘true’ value with corresponding venous admixture and shunt percentages

Fig. 4 Variation of PaCO2, VCO2, cardiac output and R (respiratory exchange ratio) values above and below ‘true’ values with corresponding venous admixture and shunt percentages

Fig. 5 Variation of PaO2, SaO2, pH and Hb above and below the ‘true’ values with corresponding effects on venous admixture and shunt percentages

4 Discussion

In a computer simulation we recovered individual shunt values from 500 diverse gas exchange scenarios using a ‘Deep Learning’ ML analysis of monitoring data. The input data consisted of the operating FiO2 plus nine ‘bedside’ monitoring measurements, including blood gas and mean PACO2 values, created from a 21-compartment model of pulmonary blood flow. Collected without FiO2 manipulation, these were sufficient for accurate shunt recovery. Therefore, unlike the ALPE method [8, 9] or our Two-FiO2 method [1], this approach should make it possible to dispense with FiO2 ‘switching’. Since VenAd values are calculated directly from the same data, corresponding ‘low V/Q’ components can be reported simultaneously as ‘VenAd − shunt’.

The small error increase observed at shunt values ≤ 15% is possibly a reflection of abbreviated ML training in lower shunt scenarios. Reduced scenario numbers in this region (although also at shunt values ≥ 30%) are evident in the test set KDE plot (Fig. 2). Scenario distributions in the training set are similar (plot not shown).

As with the Two-FiO2 method [1], the requisite data elements for ML analysis were sourced from blood gas, indirect calorimetry, and cardiac output measurements, but with the addition of mean PACO2 measurements by volumetric capnography. Corresponding mean PACO2 values could be calculated from the model with relatively minor adjustments. This factor prompted our selection of mean PACO2 over end-tidal PCO2, notwithstanding the promise shown by the latter measurement in severity stratification of acute respiratory distress syndrome (ARDS) [18].

Single-FiO2 quantification of gas transfer, in other words quantification without FiO2 manipulation, can be performed rapidly at the operating FiO2. There are no re-equilibration intervals, and there should be no possibility of signal distortion from absorption atelectasis [13] and altered hypoxic pulmonary vasoconstriction [14]. However, Single-FiO2 indices in current use have significant drawbacks. The difficulty with the three-compartment lung model of Riley and Cournand [10, 11] was canvassed in the Introduction, while simpler Single-FiO2 indices such as the A-a gradient and the PaO2/FiO2 (PF) ratio are inconsistent instruments [19] providing limited information on the underlying lung pathophysiology.

‘Scaled back MIGET’ approaches such as ALPE [8, 9] and our proposed ‘adapted West’ model [1] can provide physiological detail to assist physician decision making, the focus in this case being the separate quantification of shunt and low V/Q effects [8, 12]. More generally they represent an opportunity to improve severity and prognostic stratifications, for example in ARDS [20, 21]. To date their main downside has been a requirement for Two-FiO2 [1] or even Multi–FiO2 [8, 9] data input.

We have shown in this ‘reverse engineering’ simulation that the adapted West model can function using ML at a high level of accuracy in Single–FiO2 mode provided mean PACO2 values are included in the data mix. With the advent of volumetric capnography [15], this can be accomplished without collection of expired gas in Douglas bags. The only additional requirement was incorporation of N2 exchange in the model to enable more accurate estimates of mean PACO2.

Having both shunt and low V/Q estimates expressed as percentages of pulmonary blood flow can simplify interpretation of lung pathophysiology and thereby facilitate management decisions. As discussed previously [1], a severely reduced PF ratio in a patient with ARDS typically flags extensive lung consolidation causing a large right to left shunt. Under these circumstances practitioners may initiate recruitment maneuvers, after which positive end expiratory pressure (PEEP) settings are commonly reset to maintain recruited lung regions [19].

However, identical oxygenation disturbances can occur despite well-aerated lung with minimal true shunt. For example, in COVID-19 pneumonia there may be extensive pulmonary micro-thromboses diverting mixed venous blood flow through low V/Q compartments [12, 22]. In this latter scenario recruitment maneuvers and significant PEEP manipulations are likely to be counterproductive, the priority now being limitation of lung over-distension. Based on our findings, rapid distinction between these two scenarios should be possible by ML modelling of pulmonary blood flow distribution from data collected solely at the operating FiO2.

When upgrading from Two-FiO2 to Single-FiO2 mode, volumetric capnography could simply be added to the existing input mix. Alternatively, since VCO2 measurements would then be available from two devices, volumetric capnography could replace indirect calorimetry provided an appropriate respiratory exchange ratio (R = VCO2/VO2) was also assigned. For example, R = 0.90 was the apparent approximate mean R value in a recent report of patients with ARDS [8]. However, incorrect R values reduce accuracy (Fig. 4).

Sensitivity analyses raise other caveats. In Figs. 3, 4, 5 estimated shunt / low V/Q splits are shown to be variously sensitive to the nine bedside input values. In some cases, the VenAd calculation (which then defines low V/Q as ‘VenAd − shunt’) is also altered. Of note, mean PACO2 variability can cause significant shifts. Inspection of Fig. 3 shows that a 1 mm Hg variation above or below the ‘true’ mean PACO2 value moved the estimated shunt by approximately 2% of pulmonary blood flow up or down respectively. Sensitivity may vary in other gas exchange scenarios. For example, we observed during the evaluation that mean PACO2 sensitivity is reduced when true shunt dominates the VenAd, whereas it can exceed 3% per mm Hg when low V/Q flow is the primary gas exchange abnormality.

To report mean PACO2 the volumetric capnograph scans single breath waveforms to determine the geometric mid-point of Phase 3 [23]. Precise identification of Phase 3 commencement is therefore key, but can be uncertain when there are waveform abnormalities [24]. Single breath waveforms are distorted by several factors known to impact end-tidal PCO2 measurement, which include tachypnea, bronchospasm, bronchial intubation, partial airway obstruction, and ventilator ‘malfunctioning’ [24].

Considering that there are another eight inputs in addition to mean PACO2, each with measurement and biological variation, the Single-FiO2 method if introduced clinically would ideally incorporate multiple data sampling across a rolling time segment, for example the prior five minutes. It would necessitate capturing mean PACO2 values breath-to-breath, along with streamed inputs of blood gases, cardiac output, VO2 and VCO2. Continuous shunt/low V/Q updates could then be produced in close to real time with relatively minor damping.
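One way such a rolling time segment might be handled is sketched below. It keeps the samples of a single input (for example breath-to-breath mean PACO2) from the prior five minutes and returns their mean. The class name, the use of a simple average as the damping step, and the sample values are all assumptions made for illustration; no such implementation is described in this study.

```python
from collections import deque


class RollingWindowMean:
    """Mean of the samples received within the most recent window_s seconds."""

    def __init__(self, window_s=300.0):
        self.window_s = window_s
        self._samples = deque()  # (timestamp in seconds, value) pairs, oldest first

    def add(self, t_s, value):
        """Store a new sample, drop samples older than the window, return the mean."""
        self._samples.append((t_s, value))
        while self._samples[0][0] < t_s - self.window_s:
            self._samples.popleft()
        return sum(v for _, v in self._samples) / len(self._samples)


# Hypothetical mean PACO2 samples (mm Hg) arriving every two minutes.
window = RollingWindowMean(window_s=300.0)
for t_s, paco2 in [(0, 30.0), (120, 31.0), (240, 29.0), (360, 30.0)]:
    smoothed = window.add(t_s, paco2)
```

At the final sample the reading taken at 0 s has aged out of the five-minute window, so the mean covers only the three most recent values.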

However, the only continuous blood gas device in current use is confined to pump flow measurements during cardiopulmonary bypass [25]. Continuous optode-based analyzers are no longer available for general clinical use, mainly due to problems with measurement artifact [26]. That being so, snapshot blood gases with the rest in rolling format would be ‘the next best thing’. Even with a complete downgrade to snapshot inputs across the board, shunt values could be estimated by ML using both Single-FiO2 and Two-FiO2 methods [1], affording practitioners three contemporaneous shunt/low V/Q split estimates by two alternative methods.

5 Conclusion

We conclude, based on computer simulations of diverse gas exchange scenarios, that data collected at a single FiO2 from blood gas, indirect calorimetry, cardiac output, and volumetric capnography measurements can be used to quantify pulmonary oxygenation deficits as percentage shunt flow (V/Q = 0) versus percentage low V/Q flow (V/Q > 0). ML analysis, a game-changer for the Two-FiO2 adapted West method [1], is now shown to perform with similar accuracy in Single-FiO2 mode, so that rapid shunt/low V/Q estimates can be performed at the operating FiO2 without impacting gas transfer.