Introduction

The rationale of dynamic liver function tests consists in using the hepatic elimination rate of a properly chosen test compound as a measure for the overall 'hepatic functional mass'. Liver damage may include dysfunction of subcellular organelles (e.g. endoplasmic reticulum and mitochondria) as well as a reduction in the number of metabolically active hepatocytes, both processes affecting the CYP-450-dependent detoxification rate of a test drug. Breath tests, in particular, share the principle that a subject is administered an exogenous test compound (e.g. aminopyrine, galactose, phenylalanine or methacetin) in which the common 12C atom of a functional group has been replaced by the stable 13C isotope. In the liver, 13CO2 is enzymatically cleaved from the functional group, typically by an enzyme of the CYP-450 family, enters the plasma and is then expired. As about 1% of the natural carbon atoms of our body compounds are present as the stable isotope 13C, the abundance of the excess 13CO2 in the breath is quantified as delta-over-baseline (DOB) defined as relative concentration difference ([13CO2]–[13CO2]baseline)/[13CO2]baseline given in per mille (‰).

Breath tests are attractive because they are less invasive, relatively simple and well accepted by patients (Bonfrate et al. 2015; Braden et al. 2007). Among the various substrates utilized to evaluate quantitative liver function, the 13C-methacetin breath test (MBT) has been shown to be the most promising (Buechter et al. 2018; Klatt et al. 1997). In its original design, the MBT is performed by oral administration of the test drug and subsequent measurement of the cumulative percentage of the administered dose of 13CO2 recovered in the breath over time. Stravitz et al. (2015) reported that the oral MBT was superior to the MELD score in predicting the risk of cirrhotic complications and mortality in patients listed for liver transplantation. The oral 13C-MBT was also successfully evaluated in the assessment of acute liver injury in a rat model (Zhu et al. 2014).

As individual differences in the intestinal uptake of the test drug may distort the test results, an improved variant of the MBT (LiMAx test) has been developed at the Charité Berlin, based on the intravenous administration of the drug and quasi-continuous monitoring of 13CO2 in the breath by means of an ultra-sensitive laser system (Stockmann et al. 2009). The LiMAx test has been validated as a valuable non-invasive test in liver surgery for the pre- and postoperative assessment of organ function (Stockmann et al. 2009; Rubin et al. 2016) and for classifying the risk of cirrhotic complications and mortality in patients evaluated for liver transplantation (Jara et al. 2015). Moreover, in a retrospective study encompassing 102 patients with chronic liver disease who underwent a liver biopsy, the LiMAx test performed better than transient elastography (TE) and several serum biomarkers in detecting different stages of liver fibrosis and cirrhosis (Buechter et al. 2018). Nevertheless, the capability of the LiMAx test to reliably discriminate between a non-fibrotic (F0) and a mildly fibrotic (F1) liver was as poor as that observed with the oral version of the MBT (Dinesen et al. 2008; Razlan et al. 2011). Therefore, the LiMAx test requires further improvement to become accepted as a reliable clinical tool for the early detection of an incipient loss of the liver's functional parenchyma.

A detailed discussion of various confounding factors limiting the accuracy of the MBT is given in Gorowska-Kowolik et al. (2017). In brief, there are two different categories of confounding factors diminishing the reliability of functional breath tests. First, even if the breath test were able to provide precise information on the detoxification rate of the test drug, the utility of this parameter for the distinction between normal and diseased livers is restricted by several confounding factors such as the intake of medical drugs, smoking, aging or genetic variability, all of which may influence the expression level of the targeted metabolic pathway in the liver. Second, the kinetics of 13CO2 in the exhaled breath do not truly reflect the kinetics of enzymatic 13CO2 formation in the liver. This is a consequence of the systemic distribution of 13CO2: a certain fraction of newly formed 13CO2 is taken up into several body compartments from which it returns with delay to the plasma. Earlier studies on the systemic CO2 dynamics in humans (Barstow et al. 1989, 1990; Irving et al. 1983) revealed large variations in the rate of irreversible CO2 elimination and CO2 exchange with body compartments. Hence, depending on the size of the transiently trapped 13CO2 fraction in a specific subject, the DOB values may either over- or underestimate the actual hepatic formation rate of 13CO2. This fact is usually disregarded when equating the DOB value with the formation rate of 13CO2 in the liver.

To correct for the ‘CO2 bias’, we recently proposed a novel test variant (named “2DOB”) which is initialized with the injection of a standard dose of 13C-labeled bicarbonate (yielding information on the actual systemic CO2 kinetics in the body of the individual) followed by injection of the 13C-labeled test drug just as in the conventional breath test (Holzhutter et al. 2013). Computer simulations suggested that the predictive power of the proposed 2DOB breath test to reliably assess the CYP-specific hepatic detoxification activity should be significantly higher compared to the conventional breath test. Here we report on a first preliminary study with 38 subjects on the practicability and utility of the proposed 2DOB test.

Methods

Subjects

Sixteen patients with different types of liver pathologies (10 males, 6 females) and 22 normal subjects (14 males, 8 females) with no history or biochemical signs of liver disease were enrolled in this study (for detailed information see Supplementary Table 1). In all except one patient, the diagnosis of the liver disease was proven by histologic evaluation of either a needle biopsy or a surgical specimen. In healthy controls, a panel of laboratory parameters (aminotransferases, γ-glutamyltransferase, coagulation factors, bilirubin, alkaline phosphatase and albumin) was determined to exclude the presence of liver disease.

Experimental protocols

The study was carried out at the Department of Surgery, Charité Campus Mitte and Campus Virchow-Klinikum, Charité-Universitätsmedizin Berlin from July 2016 to February 2018. The study protocol was approved by the ethics committee of the Charité-Universitätsmedizin Berlin and adhered to the latest version of the Declaration of Helsinki, and all the study participants provided written informed consent before enrollment. Every subject underwent two examination sessions, held on separate days.

2DOB test

To avoid isotopic effects from food intake (Jonderko et al. 2008), the test was started after at least 3 h of fasting. The subject was placed in a supine position. To analyze the whole breath continuously, a specifically designed face mask was used for easy collection of exhaled breath. The gas analyses were performed in real time at the bedside using the commercially available FLIP® detection device (Humedics GmbH, Berlin, Germany) with a newly developed device for spectral analysis of the 13CO2 and 12CO2 amounts in the exhaled breath (Rubin et al. 2011, 2016). The intravenous injection of 13C-labeled bicarbonate (named 13C–B in the following) was performed after about 10 min, when the DOB baseline had reached stable values. This injection resulted in a rapid increase of DOB values, reaching a maximum after a few seconds. For the model-based computational analyses of the experimentally determined DOB data, the time point at which the DOB value of an individual patient reached its maximum after injection of 13C–B was designated "0". Then, at time T = Tm ≈ 30 min after the injection of 13C–B, a dose of 2 mg/kg body weight of 13C-labeled methacetin (13C–M) was administered intravenously, followed by a flush of 20 ml of isotonic saline solution. A time span of 30 min for phase 1 turned out to be sufficient for the unequivocal estimation of the parameters k−C, k+C and kR and was chosen to keep the duration of the full test as short as possible. Administration of 13C–M gave rise to a second increase of the DOB values followed by a more or less steep decline depending on the methacetin-metabolizing capacity of the liver (see Fig. 3). Online sampling and analysis of the 13CO2/12CO2 ratio in the breath were performed with a time resolution of 25 s. To maintain a constant hemodynamic state, test subjects had to rest in a supine position throughout the test.

A compartment model (see below) was used to estimate from the experimentally determined time-dependent DOB data numerical values for the kinetic parameters characterizing the chemical conversion rate of 13C–M, the exchange of 13C–B and 13C–M with body compartments and the irreversible elimination of 13C–B. These parameters were then tested for their ability to serve as classifiers for the discrimination between normal and diseased livers (see below).

LiMAx test

The LiMAx test is essentially identical to the second phase of the 2DOB test. After calibration of the DOB baseline curve, the test is initiated with the intravenous injection of a dose of 2 mg/kg body weight of 13C–M. Blood samples were taken from eight subjects at six time points (t = 60, 120, 300, 600, 1200, 1800 s) after 13C–M injection for the determination of the plasma values of 13C–M. The classification score of the test, LiMAx, was introduced by Stockmann et al. (2009) as

$$ \text{LiMAx} = \frac{\text{DOB}_{\max} \times \text{R}(\text{PDB}) \times P \times \text{BSA} \times \text{MM}}{\text{BW}}, \tag{1} $$

where DOBmax is the maximum value of DOB, R(PDB) = 0.011237 is the baseline ratio 13CO2/12CO2 according to the Pee Dee Belemnite standard (Craig 1957), P = 300 mmol/h/m2 is the CO2 production rate per m2 body surface, BSA is the body surface area (in m2), MM = 166 g/mol is the molar mass of the test drug 13C-methacetin and BW denotes the body weight (in kg). BSA was computed by the formula \({\text{BSA}} = 0.024265 \times {\text{BW}}^{0.5378} \times H^{0.3964} \left[ {\text{m}}^{2} \right]\), where H is the height (in cm) of the subject (Haycock et al. 1978).
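As an illustration, Eq. (1) and the body surface formula can be evaluated as in the following minimal Python sketch. The function names are hypothetical and not part of the original analysis. The sketch assumes that DOBmax is entered as the per mille number, that body weight is given in kg, and that height is given in cm with the coefficient 0.024265 of the Haycock formula; under these assumptions the score comes out in µg/kg/h.

```python
R_PDB = 0.011237         # 13CO2/12CO2 ratio of the Pee Dee Belemnite standard
P_CO2 = 300.0            # CO2 production rate in mmol/h per m^2 body surface
MM_METHACETIN = 166.0    # molar mass of 13C-methacetin in g/mol


def body_surface_area(weight_kg, height_cm):
    """Body surface area in m^2 (Haycock formula; height in cm is an assumption)."""
    return 0.024265 * weight_kg ** 0.5378 * height_cm ** 0.3964


def limax(dob_max, weight_kg, height_cm):
    """Eq. (1) with DOB_max given as the per mille number (assumed units)."""
    bsa = body_surface_area(weight_kg, height_cm)
    return dob_max * R_PDB * P_CO2 * bsa * MM_METHACETIN / weight_kg


if __name__ == "__main__":
    # Hypothetical subject: 70 kg, 175 cm, DOB_max = 20 per mille.
    print(body_surface_area(70.0, 175.0))
    print(limax(20.0, 70.0, 175.0))
```

For a fixed subject the score is simply proportional to DOBmax, so any factor that changes the peak of the DOB curve changes the score by the same factor.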

The data of the LiMAx test were also analyzed by means of the reduced compartment model (represented by the non-shaded part in the reaction scheme shown in Fig. 2) in which the kinetic parameters for the exchange of CO2/bicarbonate with body compartments were not included (see Fig. 1).

Fig. 1

Schematic representation of the procedures of the 2DOB test and the LiMAx test

Targeted multiple reaction monitoring (MRM) of methacetin and acetaminophen in the plasma

Pure metabolites were dissolved to a final concentration of 1 μM in MeOH and 0.1% formic acid, and injected by a syringe (7 μL/min) into the triple quadrupole hybrid ion trap mass spectrometer (QTrap 6500, Sciex, Framingham, MA). Precursor ions were fragmented in positive and negative electrospray ionization (ESI) modes, and the most intense fragment peaks were chosen and optimized for the following parameters: declustering potential (DP) for precursor ions, collision energy (CE), and collision cell exit potential (CXP) for fragment ions. Transitions were monitored and acquired at unit resolution (peak width at 50% was 0.7 ± 0.1 Da tolerance) in quadrupoles Q1 and Q3. Metabolites were separated on a Reprosil-PUR C18-AQ (1.9 μm, 120 Å, 150 × 2 mm ID; Dr. Maisch; Ammerbuch, Germany) column at a controlled temperature of 30 °C. The four best transitions were chosen; all MRM instrument settings are given in Table 1.

Table 1 Statistical measures of the predictive power of the LiMAx test and parameters of the 2DOB test

For the standard curve, 50 μL of plasma was mixed with 20 μL of a dilution series of acetaminophen (paracetamol) and methacetin at the following concentrations (800 nM, 200 nM, 50 nM, 12.5 nM, 3.125 nM, 781.25 pM, 195.31 pM, 48.83 pM, 12.21 pM, 3.05 pM) and 20 μL of internal standard. Then, 180 μL of acetonitrile was added to precipitate plasma proteins, and the mixture was vortexed for 20 s and incubated at room temperature for 10 min. Thereafter, the mixture was centrifuged at 16,000 rcf for 10 min. Next, 100 μL of the supernatant was taken and mixed with 44.4 μL of HPLC grade water. Finally, 5 μL of the mixture was injected into the LC–MS system. The standard curve was re-measured after the acquisition of two complete time series. The standard curve was acquired interlaced with the measurement of samples.

Blood samples were handled as described above with the following modification: 50 μL of plasma was mixed with 20 μL of internal standard and 20 μL of HPLC grade water. Each sample was injected in triplicate.

The samples (5 µL) were injected and compounds were separated on an LC instrument (1290 series UHPLC; Agilent, Santa Clara, CA) coupled online to a triple quadrupole hybrid ion trap mass spectrometer QTrap 6500 (Sciex). Data acquisition was performed with an ion spray voltage of 5.5 kV in the positive mode of the electrospray ionization source. Nitrogen as the collision gas was set to medium, the curtain gas was at 30 psi, and the ion source gases 1 and 2 were at 50 and 70 psi, respectively. The interface heater temperature was set to 350 °C.

The LC buffer compositions were as follows: (A) 10 mM ammonium acetate in LC–MS grade H2O (adjusted with acetic acid to pH 3.5), (B) LC–MS grade acetonitrile with 0.1% formic acid. The 10 min gradient was set as follows: 2% B at minutes 0–1, linearly increasing to 45% until minute 5, 98% between minutes 6 and 7, and again equilibrated at 2% between minutes 7.1–10.

Metabolite identification was based on three levels: (i) the correct retention time, (ii) four transitions, and (iii) matching MRM ion ratios of tuned pure metabolites as described previously (Lau and Ahmad 2013). Peak integration was performed using MultiQuant™ software v.2.1.1 (Sciex). All peaks were reviewed manually and adjusted if necessary. The peak area of the first transition per metabolite was used for subsequent calculations. An internal standard (13C9,15N-labeled L-phenylalanine) was used to normalize all LC–MS runs for instrumental variations. For the calculation of absolute concentrations, the intensities of three replicates (samples and standard curve) were averaged, and the concentration was obtained by linear interpolation between the two nearest values of the standard curve on a log–log plot.
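The interpolation step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure actually implemented: the function name, the data layout (pairs of concentration and averaged intensity) and the fallback to the two outermost standards for values outside the curve are all hypothetical.

```python
import math


def interpolate_loglog(intensity, standards):
    """Concentration for a measured intensity by linear interpolation on log-log axes.

    standards: list of (concentration, averaged_intensity) pairs, all positive.
    The two standards bracketing the sample intensity are used; outside the
    curve the two outermost standards are used (extrapolation, an assumption).
    """
    pts = sorted(standards, key=lambda p: p[1])
    if len(pts) < 2:
        raise ValueError("need at least two standard points")
    # Index of the first standard whose intensity is >= the sample intensity.
    hi = next((i for i, p in enumerate(pts) if p[1] >= intensity), len(pts) - 1)
    hi = max(hi, 1)
    (c0, i0), (c1, i1) = pts[hi - 1], pts[hi]
    slope = (math.log(c1) - math.log(c0)) / (math.log(i1) - math.log(i0))
    return math.exp(math.log(c0) + slope * (math.log(intensity) - math.log(i0)))


if __name__ == "__main__":
    # Hypothetical standard curve (nM, arbitrary intensity units).
    curve = [(800.0, 8000.0), (200.0, 2000.0), (50.0, 500.0), (12.5, 125.0)]
    replicates = [990.0, 1000.0, 1010.0]          # triplicate injections
    mean_intensity = sum(replicates) / len(replicates)
    print(interpolate_loglog(mean_intensity, curve))
```

Interpolating on logarithmic axes is a natural choice for a fourfold dilution series, because the standards are evenly spaced in log concentration.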

Compartment modeling of systemic 13CO2 and 13C-methacetin kinetics

We used the compartment model shown in Fig. 2 to describe the kinetics of 13C-methacetin (abbreviated as 13M) and the additionally formed H13CO3 + 13CO2 (abbreviated as Δ13C) resulting from the administration of 13C-labeled bicarbonate and enzymatic conversion of 13C-methacetin. Combining 13CO2 and H13CO3 into one variable is justified by the fact that both entities are in quasi-equilibrium due to the fast reaction catalyzed by the enzyme carbonic anhydrase. The model takes into account the exchange of 13CO2 + H13CO3 and 13C-methacetin between the blood and other body compartments (denoted by X and Y, respectively). Earlier studies of Barstow et al. (1985, 1990) and Irving et al. (1983) on the 13C-bicarbonate kinetics in humans have provided evidence for the existence of at least three CO2 exchange compartments differing in the characteristic time constants for the exchange kinetics. Over shorter time ranges of about 15–30 min as considered in our test, the error made by neglecting the slow exchange processes should remain sufficiently small.

Fig. 2

Scheme of the compartment model. The variable Δ13C denotes the sum of additionally (above baseline!) formed 13CO2 and H13CO3, 13M denotes 13C-methacetin. Both metabolites can be exchanged between the plasma and other body compartments lumped together into single compartments (X and Y, respectively). Uptake of 13C-methacetin into the liver, conversion to the reaction products acetaminophen and 13CO2 by CYP1A2, and release of the enzymatically formed 13CO2 into the blood plasma are described by an overall reaction with rate constant kL. Phase 1 starts with the intravenous administration of 13C-labeled bicarbonate. Phase 2 starts about 30 min later with the intravenous administration of 13C-labeled methacetin (see test protocols)

The reaction scheme shown in Fig. 2 differs from the reaction scheme used in our previous work (Holzhutter et al. 2013) in the following respects: (i) the reversible exchange of the drug with non-hepatic compartments is now taken into account, (ii) the kinetics of the reaction product acetaminophen is not considered in the model as the plasma concentration of this drug was not monitored, (iii) exchange of 13C-methacetin with the liver, conversion to the reaction products acetaminophen and 13CO2 and release of the formed 13CO2 into the blood plasma are lumped together into a single process with rate constant kL. The latter simplification was necessary as the reversible exchange of 13C-methacetin with the liver and with extra-hepatic organs cannot be discerned in the shape and magnitude of the DOB curve. Note that the parameter kR represents the total elimination rate of Δ13C from the plasma, including the release of 13CO2 from the plasma into the breath as well as other modes of irreversible loss such as renal excretion and slow covalent fixation in organic molecules.

Kinetic equations

The reaction scheme shown in Fig. 2 is governed by the following set of ordinary differential equations:

$$ \begin{aligned}
\frac{\text{d}\,{}^{13}\text{M}_{\text{B}}}{\text{d}t} &= k_{+\text{M}}\,{}^{13}\text{M}_{\text{Y}}^{*} - \left( k_{-\text{M}} + k_{\text{L}} \right)\,{}^{13}\text{M}_{\text{B}}, \\
\frac{\text{d}\,{}^{13}\text{M}_{\text{Y}}^{*}}{\text{d}t} &= - k_{+\text{M}}\,{}^{13}\text{M}_{\text{Y}}^{*} + k_{-\text{M}}\,{}^{13}\text{M}_{\text{B}}, \\
\frac{\text{d}\,\Delta^{13}\text{C}_{\text{B}}}{\text{d}t} &= k_{\text{L}}\,{}^{13}\text{M}_{\text{B}} + k_{+\text{C}}\,\Delta^{13}\text{C}_{\text{X}}^{*} - \left( k_{-\text{C}} + k_{\text{R}} \right)\,\Delta^{13}\text{C}_{\text{B}}, \\
\frac{\text{d}\,\Delta^{13}\text{C}_{\text{X}}^{*}}{\text{d}t} &= - k_{+\text{C}}\,\Delta^{13}\text{C}_{\text{X}}^{*} + k_{-\text{C}}\,\Delta^{13}\text{C}_{\text{B}}.
\end{aligned} \tag{2} $$

For the meaning of the rate constants in equation system (2), see the legend to Fig. 2. Note that the variables \(\Delta^{13}{\text{C}}_{\text{X}}^{*}\) and \({}^{13}{\text{M}}_{\text{Y}}^{*}\) represent effective concentrations in the compartments X and Y that are related to the true concentrations \({\text{C}}_{\text{X}}\) and \({\text{M}}_{\text{Y}}\) by the relations \({\text{C}}_{\text{X}} = \frac{\Omega_{\text{B}}}{\Omega_{\text{X}}}\,{\text{C}}_{\text{X}}^{*}\) and \({\text{M}}_{\text{Y}} = \frac{\Omega_{\text{B}}}{\Omega_{\text{Y}}}\,{\text{M}}_{\text{Y}}^{*}\), with \(\Omega\) denoting the (unknown) volume of the respective compartment. For the numerical integration of equation system (2) and the estimation of numerical parameter values, see the Supplementary Information (Computational Details).
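To make the structure of equation system (2) concrete, the following Python sketch integrates it for the two-phase protocol with a classical fourth-order Runge–Kutta scheme. It is an illustration only and not the integration method of the Supplementary Information. The function names, the unit doses, the step size and the instantaneous-bolus treatment of both injections are assumptions, as is the idea that the DOB signal follows the plasma variable Δ13C_B up to a scaling factor.

```python
def rhs(state, p):
    """Right-hand side of equation system (2).

    state = (M_B, M_Y, C_B, C_X): plasma 13M, 13M in compartment Y,
    plasma excess 13C, excess 13C in compartment X (effective concentrations).
    """
    m_b, m_y, c_b, c_x = state
    d_m_b = p["k_plus_M"] * m_y - (p["k_minus_M"] + p["k_L"]) * m_b
    d_m_y = -p["k_plus_M"] * m_y + p["k_minus_M"] * m_b
    d_c_b = p["k_L"] * m_b + p["k_plus_C"] * c_x - (p["k_minus_C"] + p["k_R"]) * c_b
    d_c_x = -p["k_plus_C"] * c_x + p["k_minus_C"] * c_b
    return (d_m_b, d_m_y, d_c_b, d_c_x)


def rk4_step(state, p, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = rhs(state, p)
    k2 = rhs(shifted(k1, dt / 2), p)
    k3 = rhs(shifted(k2, dt / 2), p)
    k4 = rhs(shifted(k3, dt), p)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(p, bicarbonate_dose=1.0, methacetin_dose=1.0,
             t_m=30.0, t_end=70.0, dt=0.01):
    """Bicarbonate bolus at t = 0, methacetin bolus at t = t_m (times in min)."""
    state = (0.0, 0.0, bicarbonate_dose, 0.0)
    n_steps = int(round(t_end / dt))
    bolus_step = int(round(t_m / dt))
    trajectory = [(0.0, state)]
    for i in range(n_steps):
        if i == bolus_step:
            # Phase 2: add the methacetin bolus to the plasma pool.
            m_b, m_y, c_b, c_x = state
            state = (m_b + methacetin_dose, m_y, c_b, c_x)
        state = rk4_step(state, p, dt)
        trajectory.append(((i + 1) * dt, state))
    return trajectory


if __name__ == "__main__":
    # Rate constants in 1/min; the two CO2 exchange constants are hypothetical values.
    params = {"k_L": 0.039, "k_plus_M": 0.036, "k_minus_M": 0.082,
              "k_plus_C": 0.05, "k_minus_C": 0.1, "k_R": 0.22}
    for t, s in simulate(params)[::1000]:
        print(t, s[2])   # plasma excess 13C, taken as proportional to DOB
```

With kR set to zero the four variables sum to the total administered label at all times, which gives a simple check that the signs of the exchange terms are consistent.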

Definition of a scoring function for the quantification of hepatic detoxification capacity

Different combinations of model parameters were tried to define an appropriate scoring function that reliably quantifies the hepatic detoxification capacity. The power of a scoring function to discriminate between the detoxification capacities of healthy and diseased livers was evaluated by means of a receiver operating characteristic (ROC) curve analysis. The area under the ROC curve (AUC) is an effective and widely used measure of the discriminating power of a diagnostic test or statistical model. Threshold values of model parameters used in the scoring function were chosen such that the Youden index, Y = TP + (1 − FP) − 1 (TP: true positive rate = fraction of ill subjects correctly classified; FP: false positive rate = fraction of healthy subjects falsely classified), becomes maximal. Assessment of the statistical significance of the difference between any two classifiers based on the AUC was performed by a non-parametric test that accounts for the correlation of the ROC curves (Vergara et al. 2008). Sensitivity of classification results to random variations of the training set of healthy and diseased subjects was checked by means of bootstrap resampling (Mossman 1995), based on synthetic data sets created by random selection (with permitted repetition) of original data.
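The ROC analysis and the threshold selection can be illustrated by the following minimal Python sketch. It is not the software used in the study: the function names and the toy data are hypothetical, low scores are assumed to indicate disease, and the Youden index is taken in its usual form, TP + (1 − FP) − 1 = TP − FP.

```python
def roc_points(scores, labels):
    """ROC points (FP rate, TP rate, threshold) for 'score <= threshold means diseased'.

    labels: 1 = diseased, 0 = healthy. Lower scores are assumed to indicate
    disease (e.g. a reduced detoxification rate constant).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0, None)]
    for thr in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s <= thr)
        fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s <= thr)
        points.append((fp / n_neg, tp / n_pos, thr))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def best_youden(points):
    """Threshold that maximizes the Youden index TP - FP, and the index itself."""
    fp, tp, thr = max(points[1:], key=lambda p: p[1] - p[0])
    return thr, tp - fp


if __name__ == "__main__":
    # Toy data: hypothetical kL-like scores for two patients and two controls.
    scores = [0.01, 0.02, 0.03, 0.04]
    labels = [1, 0, 1, 0]
    pts = roc_points(scores, labels)
    print(auc(pts), best_youden(pts))
```

For a bootstrap check in the spirit of the text, the same functions would be applied to data sets drawn with replacement from the original subjects.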

Results

Adjustment of the compartment model to measured DOB data

Both variants of the methacetin breath test, the conventional LiMAx test initiated by the injection of the test drug and subsequent monitoring of 13CO2 in the breath, and the proposed novel 2DOB test, were carried out in 38 subjects (for test procedures see Fig. 1). Figure 3 shows typical DOB curves of the 2DOB test obtained in a liver-healthy subject and a patient with liver cirrhosis. Experimental and computed DOB curves of all subjects are given in Fig. 1 of the Supplementary Information.

Fig. 3

Patient-specific kinetics of 13C-labeled CO2 and methacetin in a healthy subject (HS) and a subject with liver cirrhosis (DS). a DOB curve data of HS (open circles, green line) and DS (filled circles, red line). The lines represent theoretical data obtained by fitting the compartment model to the measured DOB data. Over the full time course of the experiment, the DOB values were monitored every 25 s. After a calibration phase of about 600 s, at time "0" a dose of 2 mg/kg body weight of 13C-bicarbonate was injected intravenously. At time point TM (indicated by the arrow), a dose of 2 mg/kg body weight of 13C-methacetin was injected intravenously. b Simulated and measured time course of plasma methacetin of the normal subject (green curve, open circles) and the patient with liver cirrhosis (red curve, closed circles). c Simulated time course of additionally formed 13C-CO2/bicarbonate (Δ13CX*) in body compartment X. Note that relative concentrations are shown which are linearly related to the true concentrations by a scaling factor that depends on the unknown volume of compartment X. d Simulated time course of 13C-methacetin (13MY*) in body compartment Y. Note that relative concentrations are shown which are linearly related to the true concentrations by a scaling factor that depends on the unknown volume of compartment Y. e Flux changes in DS compared to HS. Bold arrow: increased flux, dotted arrow: decreased flux

The DOB curve data were used to parameterize the compartmental model, i.e. to estimate numerical values for the model parameters (see Supplementary Information). Model fitting yielded numerical estimates for the rate constants k+C, k−C of Δ13C exchange with body compartment X, the rate constant kR of irreversible Δ13C elimination, the rate constants k+M, k−M of 13M exchange with body compartment Y and the rate constant kL for the hepatic detoxification of 13M. The two-step procedure applied for fitting subsets of model parameters separately to the first and second phase of the DOB curve was chosen as some parameters have a partially redundant influence on the height and shape of the DOB curve. For example, an increase of DOB values can be elicited by an increase of kL or a decrease of the CO2 elimination rate kR. The sensitivity analysis in the Supplement “Computational Details” illustrates that estimation of numerical parameter values from different parts of the 2DOB curve is necessary for unequivocal estimates of the parameters k+M and k−M.

From the DOB data of the LiMAx test, numerical estimates could be derived only for the four model parameters kR, k+M, k−M and kL. Supplementary Table 1 lists the numerical values of all model parameters for each subject.

Once parameterized, the model allows simulating the time course of Δ13C and 13M in the compartments X and Y, respectively. An example is shown in Fig. 3. Compared with the healthy subject (HS), the patient with cirrhosis (DS) displays a higher uptake rate of Δ13C into body compartment X after administration of an identical bolus of 13C-bicarbonate. Nevertheless, the rise of Δ13C in compartment X of DS is lower owing to the significantly lower detoxification rate of 13M and correspondingly lower liberation rate of Δ13C into the plasma (mirrored by the lower maximum of the DOB curve). Note that 40 min after administration of the test drug, a substantial portion of Δ13C is still retained in body compartment X. The model simulation predicts a faster decline of plasma 13M in HS compared with DS. This is a consequence of DS having both a lower detoxification rate of 13M and a lower uptake rate of 13M into compartment Y (Fig. 3d, e). The model predictions were validated by the good concordance between the theoretical curves and measurements of 13M at various time points after 13M administration (Fig. 3b). It has to be mentioned that for subjects with very low values of the detoxification rate constant kL, there was a tendency of the computed time course of plasma methacetin to underestimate the true (= experimentally determined) clearance of 13M from the plasma (see Fig. 2, Supplementary Information). This discrepancy is due to the fact that, as the rise of the DOB curve after 13M administration becomes smaller, the two parameters k+M and k−M for the exchange of 13C-methacetin with compartment Y become less identifiable from the DOB curve (see Supplement “Computational details”, Parameter Identifiability). At the extreme, if there is no rise of the DOB curve at all, indicating that detoxification of the drug is absent (kL = 0), the numerical values of k+M and k−M cannot be determined as the reversible exchange of 13C-methacetin with compartment Y is not connected with the 13CO2 kinetics.

Influence of systemic CO2 kinetics

The values of the model parameters k+C, k−C and kR derived from the first part of the DOB curve provide information on the systemic CO2/HCO3 kinetics of the subject. The numerical estimates for the Δ13C elimination rate kR varied between 0.085 and 0.40/min, the mean value across the 38 subjects amounts to 0.22/min. For a subject with a normal acid–base status of 25 mM plasma CO2 and a blood volume of 5 L, this translates into an average diurnal elimination rate of about 38 mol/day. This value is above the known resting value of about 20 mol/day for the respiratory CO2 elimination rate (Jonderko et al. 2008). Hence, on the average, about one-half of the eliminated Δ13C must be rapidly taken up by a “fast” body compartment in a quasi-irreversible manner such that the slow return of this fraction to the plasma is not observable within the time span of the test. This is also reflected in an average 13CO2 recovery rate of 54% in the exhaled breath at T = 15 min after 13C-bicarbonate administration (for individual recovery rates see Fig. 4, Supplementary Information).
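The order of magnitude of this estimate can be checked with a few lines of Python. This is a sketch with a hypothetical function name; it assumes that the first-order rate constant kR acts on a plasma pool given by the stated concentration and volume. With the rounded mean of 0.22/min it yields a value close to 40 mol/day, of the same size as the figure of about 38 mol/day quoted above.

```python
def daily_elimination_mol(k_r_per_min, plasma_co2_mmol_per_l=25.0, volume_l=5.0):
    """First-order elimination flux k_R * pool size, converted to mol/day."""
    pool_mol = plasma_co2_mmol_per_l * volume_l / 1000.0   # mmol -> mol
    return k_r_per_min * pool_mol * 60 * 24                # per min -> per day


if __name__ == "__main__":
    # Lowest, mean and highest k_R estimates reported in the text (1/min).
    for k_r in (0.085, 0.22, 0.40):
        print(k_r, daily_elimination_mol(k_r))
```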

Notably, the distribution of values for the CO2 elimination rate kR in the group of patients with a liver disease exhibits a general shift towards small values (see Fig. 3, Supplementary Information). Six liver patients had kR values smaller than 0.17 min−1 (≈ 66% of the mean). Nevertheless, there was no statistically significant difference between the mean kR values in the group of healthy subjects (0.222/min) and diseased subjects (0.220/min) according to the two-sample t test (p = 0.77 at α = 0.05).

Figure 4 illustrates how individual variations of the systemic CO2 kinetics may influence the shape of the DOB curve at fixed 13C-methacetin kinetics. The curves were simulated with fixed values of the kinetic parameters kL, k+M, k−M determining the methacetin kinetics, but varying values of the kinetic parameters k+C, k−C, kR obtained for the 38 subjects of this study. The resulting curves differ substantially in characteristic curve parameters such as peak value, time-to-peak value, area under the curve and steepness of decline. For subject #33 with normal liver function, the peak values of the DOB curves (see Fig. 4a) may differ from the "true" peak value by more than a factor of 2. Extreme DOB curves would be produced if subject #33 had the CO2 kinetics of subject #23 (maximal positive deviation) or subject #28 (maximal negative deviation). The coefficient of variation (CV = standard deviation/mean) of peak values is 0.25. For subject #37 with liver cirrhosis taken as reference, the spread of DOB curves is generally lower than for the normal subject (CV of peak values = 0.14). For the majority of DOB curves, the peak values at T = 300 s differ by less than a factor of 1.5 from the reference curve.

Fig. 4

Simulated DOB curves at fixed methacetin kinetics, but variable CO2 kinetics. a The kinetic parameters determining the methacetin kinetics were fixed at those values obtained for subject #33 with normal liver function: kL = 0.039 min−1, k+M = 0.036 min−1, k−M = 0.082 min−1. b The kinetic parameters determining the methacetin kinetics were fixed at those values obtained for subject #37 with liver cirrhosis: kL = 0.010 min−1, k+M = 0.131 min−1, k−M = 0.06 min−1. The set of parameters k+C, k−C, kR determining the CO2 kinetics was set to the values obtained for the 38 subjects, i.e. each curve represents the DOB curve of a subject comprising the methacetin kinetics of subject #33 (a) or subject #37 (b) but the CO2 kinetics of one of the subjects 1–38. Note that the curves represent the second part of the 2DOB curves starting with the administration of 13C-methacetin, i.e. time "0" corresponds to time Tm of the full test curve. For a better comparison of curves, the residual DOB value at t = Tm was subtracted so that the initial DOB value at time = 0 is zero for all curves. Blue curves: true DOB curve of the reference subject (subject #33 in a, subject #37 in b). Coefficient of variation (CV) of maximum DOB values: 0.25 (a), 0.14 (b)

This suggests that the likelihood of overestimating the true drug-detoxifying capacity owing to variations of the CO2 kinetics is lower for a subject with a functionally impaired liver. The higher risk of over- or underestimating the 'true' hepatic detoxification capacity in subjects with normal or slightly impaired liver functionality compared with subjects with severe liver dysfunction is also reflected in the relationship between the detoxification rate constant kL and the parameter LiMAx serving as predictor of the detoxification capacity in the conventional LiMAx test and representing up to a scaling factor the peak value of the DOB curve (see Fig. 5). Despite the highly significant overall correlation of these two parameters, the largest deviations from the hyperbolic regression line occur in the middle range of kL values (grey-shaded area in Fig. 5) indicating either normal or mildly impaired detoxification capacity.

Fig. 5

LiMAx versus model parameter kL (detoxification rate). LiMAx values were determined from the DOB curves of the conventional LiMAx test according to Eq. (1) (data given in Supplementary Table 1). Green and red dots indicate data for controls and patients, respectively. The grey-shaded region highlights the range of largest deviations between kL and LiMAx. The dotted curve represents the best-fit hyperbolic relationship between LiMAx and kL: LiMAx = 744 × kL/(0.028 + kL)
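The fitted hyperbola can be written as a small Python helper, together with its inverse. This is only a sketch of the regression curve given in the caption: the function names are hypothetical, and kL is assumed to be expressed in min−1, as in the caption of Fig. 4.

```python
LIMAX_PLATEAU = 744.0   # fitted plateau of the hyperbola
K_HALF = 0.028          # kL (assumed in 1/min) giving half of the plateau


def limax_from_kl(k_l):
    """Best-fit hyperbolic relationship LiMAx = 744 * kL / (0.028 + kL)."""
    return LIMAX_PLATEAU * k_l / (K_HALF + k_l)


def kl_from_limax(limax_value):
    """Inverse of the fitted hyperbola, defined below the plateau only."""
    if not 0.0 <= limax_value < LIMAX_PLATEAU:
        raise ValueError("LiMAx value outside the range of the fitted curve")
    return K_HALF * limax_value / (LIMAX_PLATEAU - limax_value)


if __name__ == "__main__":
    for k_l in (0.010, 0.028, 0.039):
        print(k_l, limax_from_kl(k_l))
```

Because the curve saturates, doubling kL in the upper range raises the predicted LiMAx value by less than a factor of two.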

The predictive capacity of model parameters as disease classifiers

Next we studied the suitability of model parameters for the binary classification of the subject's liver as normal (disease class "1") or impaired (disease class "2"). Obviously, the model parameter kL quantifying the hepatic detoxification rate of the test drug should serve as an appropriate classifier. Testing the predictive capacity of kL by means of a receiver operating characteristic (ROC) analysis yielded a value of AUC = 0.85 for the area under the curve (AUC), a true positive rate (TP) of 0.75 and a true negative rate (TN) of 0.86. Thus, the predictive power achieved with the parameter kL was not better than the predictive power of the LiMAx score (AUC = 0.82, TP = 0.75, TN = 0.86). Notably, both kL and the LiMAx score yielded false negative classifications for the same group of subjects. This suggests that a chronic liver disease need not necessarily be paralleled by a significantly lowered chemical conversion rate of the test drug.

We then tested whether other model parameters may serve as disease predictors. Intriguingly, the parameter k−M describing the initial rate of methacetin uptake into the storage compartment Y yielded a surprisingly high quality of disease classification (AUC = 0.72). The parameter k+M describing the release rate of methacetin from compartment Y also yielded a statistically significant classification with AUC = 0.65. Hence, the storage capacity of compartment Y for methacetin appears to be reduced in diseased livers. As the storage capacity of compartment Y depends on both k+M and k−M, we used the average concentration of methacetin stored in compartment Y over a time span TS = 3000 s after administration of 13C-methacetin (13M) at time t = TM,

$$ M_{L} = \frac{1}{3000}\int\limits_{{T_{M} }}^{{T_{M} + T_{S} }} {^{13} M_{Y}^{*} \left( t \right)} \;dt, $$
(3)

as a new disease classifier yielding an AUC value of 0.81 (see Fig. 6). Remarkably, the ROC curve associated with the measure ML already reached a true positive rate of more than 90% at a false positive rate of about 30%, i.e. ML is better suited than all other measures tested to reliably identify patients with liver disease.

Fig. 6

Receiver operating characteristic curves to assess the predictive power of the 2DOB and LiMAx tests. Black curve: LiMAx value used as classifier (AUC = 0.82). Blue curve: parameter kL used as classifier (AUC = 0.83). Green curve: parameter ML defined by Eq. (3) used as classifier (AUC = 0.81). Red curve: 2DOB score defined by Eq. (4) used as classifier (AUC = 0.95)
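The time average in Eq. (3) can be approximated numerically once the model-predicted methacetin content of compartment Y is available on a time grid. The sketch below uses the trapezoidal rule with linear interpolation at the window boundaries; the function names and the sampled-curve input format are assumptions made for illustration and do not describe the authors' implementation.

```python
def interp(t, times, values):
    """Linear interpolation of a sampled curve at time t.
    `times` must be strictly increasing and must cover t."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the sampled time range")
    for i in range(1, len(times)):
        if t <= times[i]:
            t0, t1 = times[i - 1], times[i]
            w = (t - t0) / (t1 - t0)
            return values[i - 1] + w * (values[i] - values[i - 1])


def mean_storage(times, m_y, t_m, t_s=3000.0):
    """Trapezoidal estimate of (1/T_S) * integral of M_Y(t) dt over [T_M, T_M + T_S]."""
    t_end = t_m + t_s
    # Window boundaries plus all sample points strictly inside the window.
    ts = [t_m] + [t for t in times if t_m < t < t_end] + [t_end]
    ys = [interp(t, times, m_y) for t in ts]
    area = sum(0.5 * (ys[i] + ys[i - 1]) * (ts[i] - ts[i - 1])
               for i in range(1, len(ts)))
    return area / t_s


if __name__ == "__main__":
    # Synthetic linear test curve (not model output): M_Y(t) = t.
    times = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
    print(mean_storage(times, times, t_m=500.0))
```

For a finely sampled model curve the trapezoidal rule is usually adequate; a coarser grid would call for a higher-order rule or for integrating the model equations directly.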

We then examined whether a combination of the two disease predictors, kL and ML, may provide an even better classification than either parameter alone. As there are endless possibilities to combine the model parameters into a scoring function, we followed Occam’s razor, “not to multiply entities without necessity” (Schaffer 2015), and used the simplest (linear) combination of the two model parameters kL and ML that individually provided the best classification results:

$$ 2\text{DOB score}\left( k_{\text{L}}, M_{\text{L}}; k_{\text{Lc}}, M_{\text{Lc}} \right) = \begin{cases} \dfrac{M_{\text{L}}}{M_{\text{Lc}}} & \text{if } k_{\text{L}} \le k_{\text{Lc}} \text{ and } M_{\text{L}} \ge M_{\text{Lc}} \\ \dfrac{k_{\text{L}}}{k_{\text{Lc}}} & \text{else} \end{cases}. $$
(4)

The 2DOB score defined in Eq. (4) is a continuous measure of the hepatic detoxification capacity, whereby values smaller than unity indicate an impaired liver function. The 2DOB score is basically identical with the normalized detoxification rate kL/kLc except for those cases where kL remains below the threshold value kLc (indicating an impaired detoxification rate) whereas the storage capacity ML is larger than the threshold value MLc. This definition of the 2DOB score implies that the liver is classified as impaired if and only if both parameters, kL and ML, are smaller than the cut-off values kLc and MLc (see Fig. 7).
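The piecewise definition in Eq. (4) translates directly into a few lines of code. In the sketch below the function name is illustrative, and the cut-off values used in the example are placeholders chosen for demonstration; the actual cut-offs are those listed in Table 1.

```python
def two_dob_score(k_l: float, m_l: float, k_lc: float, m_lc: float) -> float:
    """2DOB score following Eq. (4).

    Returns M_L / M_Lc when the detoxification rate is at or below its cut-off
    but the storage capacity is at or above its cut-off; otherwise k_L / k_Lc.
    """
    if k_l <= k_lc and m_l >= m_lc:
        return m_l / m_lc
    return k_l / k_lc


if __name__ == "__main__":
    K_LC, M_LC = 0.03, 50.0  # placeholder cut-offs, NOT the values from Table 1
    examples = {
        "low kL, high ML": (0.02, 70.0),
        "low kL, low ML": (0.02, 40.0),
        "high kL, low ML": (0.04, 40.0),
    }
    for label, (k_l, m_l) in examples.items():
        score = two_dob_score(k_l, m_l, K_LC, M_LC)
        status = "impaired" if score < 1.0 else "normal"
        print(f"{label}: score = {score:.2f} ({status})")
```

Only the case in which both parameters fall below their cut-offs produces a score below unity, which matches the red-bounded region described for Fig. 7.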

Fig. 7

2DOB score as a function of the normalized parameters kL and ML. Parameters are normalized with respect to their cut-off values, i.e. kL/kLc ≥ 1 means kL ≥ kLc. The red lines enclose the region of score values that are smaller than unity, thus indicating an impaired hepatic detoxification capacity. Note that according to definition (4) of the 2DOB score, small values of the detoxification rate kL below the threshold kLc nevertheless yield a 2DOB score larger than 1 (= liver healthy) if the value of the storage capacity exceeds the threshold value (ML ≥ MLc)

With the 2DOB score used as classifier, indeed a significant improvement of the classification quality was achieved; the AUC2DOBscore = 0.95 is significantly larger than the AUCLiMAx = 0.82 (z = 1.9, p = 0.028).
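The reported p-value can be related to the z statistic through the standard normal distribution. The sketch below assumes a one-sided test, which the text does not state explicitly; under that assumption z = 1.9 gives a p-value close to the reported figure.

```python
import math


def one_sided_p(z: float) -> float:
    """Upper-tail probability of the standard normal distribution, P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # One-sidedness is an assumption; a two-sided test would double this value.
    print(f"p(z = 1.9) = {one_sided_p(1.9):.4f}")  # about 0.029
```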

The cut-off values kLc and MLc given in Table 1 were determined by maximizing the AUC value across the group of 38 subjects.
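One way to picture this optimization is an exhaustive search over candidate cut-off pairs, keeping the pair that gives the largest AUC of the resulting 2DOB score. The text does not specify the search procedure, so the grid search below is an assumption, and all function names, grids and data are hypothetical.

```python
from itertools import product


def two_dob_score(k_l, m_l, k_lc, m_lc):
    """2DOB score following Eq. (4)."""
    if k_l <= k_lc and m_l >= m_lc:
        return m_l / m_lc
    return k_l / k_lc


def roc_auc(scores_healthy, scores_diseased):
    """Fraction of (healthy, diseased) pairs ranked correctly; ties count one half."""
    wins = sum(1.0 if h > d else 0.5 if h == d else 0.0
               for h in scores_healthy for d in scores_diseased)
    return wins / (len(scores_healthy) * len(scores_diseased))


def best_cutoffs(healthy, diseased, k_grid, m_grid):
    """Grid search (an assumed procedure) for the cut-off pair maximizing the AUC.

    healthy / diseased: lists of (k_l, m_l) tuples.
    Returns (auc, k_lc, m_lc); the first pair reaching the maximum is kept.
    """
    best = None
    for k_lc, m_lc in product(k_grid, m_grid):
        h = [two_dob_score(k, m, k_lc, m_lc) for k, m in healthy]
        d = [two_dob_score(k, m, k_lc, m_lc) for k, m in diseased]
        auc = roc_auc(h, d)
        if best is None or auc > best[0]:
            best = (auc, k_lc, m_lc)
    return best


if __name__ == "__main__":
    # Invented (kL, ML) pairs for illustration only.
    healthy = [(0.05, 60.0), (0.02, 70.0)]
    diseased = [(0.01, 30.0), (0.03, 20.0)]
    print(best_cutoffs(healthy, diseased, k_grid=[0.025, 0.04], m_grid=[50.0]))
```

Because the cut-offs are tuned on the same subjects that are then classified, the resulting AUC is likely to be optimistic, which is why the resampling analysis in the robustness section is informative.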

Relationship between the 2DOB score and the MELD score

As the 2DOB score is defined as a continuous measure of the hepatic detoxification capacity, it was tempting to check whether the 2DOB score correlates with parameters that are used in clinical practice for assessing the severity of chronic liver diseases. Figure 8 reveals a high correlation between the 2DOB score and the MELD (Model for End-Stage Liver Disease) score of the 38 test subjects (RSpearman = 0.92, p < 0.00001).
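Spearman's rank correlation is suited to this comparison because it captures monotonic but non-linear relationships. The following self-contained sketch computes it as the Pearson correlation of average ranks; the function names and toy values are illustrative and are not the study data.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the average of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # Invented values: a monotonically decreasing, non-linear relation.
    score = [0.2, 0.5, 1.0, 1.5]
    meld = [20, 12, 7, 6]
    print(spearman(score, meld))
```

The sign of the coefficient simply reflects the direction of the monotonic trend; its magnitude measures how consistently the ranks agree.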

Fig. 8

Relationship between the 2DOB score and the MELD score of the 38 test subjects. Green dots: normal subjects. Red dots: subjects with clinically diagnosed liver disease. The black line is intended to guide the eye and to underline the non-linear relationship. The vertical blue lines at 2DOB score = 1 mark the threshold between normal and impaired detoxification capacity

There exists a clear monotonic and highly non-linear relationship between the 2DOB score and the MELD score. 2DOB scores smaller than 0.4 are consistently associated with MELD scores larger than 8. The only outlier was subject #33 (black data point in Fig. 8), who has a clinically inconspicuous liver, correctly reflected by a 2DOB score > 1, but an unusually high MELD score of 14 because this subject was under steady treatment with anticoagulants, which are known to influence the MELD score. As shown in Fig. 8, the 2DOB score proves to be a much more sensitive indicator of a beginning or moderate liver disease than the MELD score. This is no surprise, as the MELD was developed and adapted as a prognostic tool in advanced liver diseases (Lau and Ahmad 2013).

Robustness of the 2DOB test and further test optimization

Age-differences between controls and patients

The subjects in the control group of our study were significantly younger (mean age = 42.4 years) than the subjects with liver disease (mean age = 55.1 years). Aging has been reported to cause moderate declines in the Phase I metabolism of certain drugs (Cieslak et al. 2016; Schmucker 2005; Schmucker and Wang 1980) and thus could have affected the classifications. Therefore, we split the group of the 22 control subjects into 2 sub-groups, comprising 11 healthy subjects aged from 22 to 49 (mean age = 30.7) and 11 healthy subjects aged from 50 to 59 (mean age = 54.5). The older control group matches approximately the age of the diseased subjects. There was a slight albeit statistically insignificant tendency towards higher LiMAx scores (from 428 to 458) and kL values (from 0.044 to 0.054), suggesting a marginal age-related increase of CYP1A2-dependent metabolization rates in normal livers. Interestingly, the mean value of the parameter ML dropped from 62.5 in the group of younger controls to 50.7 in the group of older controls, suggesting an age-dependent decline of the storage capacity of the liver-associated compartment Y.

The classifications performed using the two age-stratified control groups separately (see Table 3, Supplementary Information) yielded identical statistical measures for the 2DOB test. Of note, the cut-off values for kL and ML were identical irrespective of the control group used. The ROC statistics of the LiMAx test were also only slightly affected by the choice of the control group. However, the LiMAx cut-off value separating normal and diseased livers differed markedly (282 versus 363) depending on whether the younger or the older controls were used as reference.

We also tested the robustness of LiMAx- and 2DOB-based classifications against random variations of the training set of healthy and diseased subjects using the method of bootstrap resampling (Mossman 1995). The frequency distribution of AUC values shown in Fig. 9 was constructed by randomly drawing 38 subjects with replacement from the original set, i.e. the resampled set contained some data sets more than once (overrepresented) while others were missing. For this synthetic data set, values of AUC and the cut-off parameters were computed. This resampling was applied 1000 times. The bootstrap variances of the AUC and of the cut-off values are also given in the legend to Fig. 9. In less than 2% of trials, the AUC2DOB was below 0.8; the mean AUC2DOB was 0.94.
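The resampling scheme described above can be sketched as follows. The code pools all subjects, draws as many subjects as the original set with replacement, and recomputes the AUC for each resample. Discarding resamples that happen to contain only one class is an assumption of this sketch, as are the function names and the toy scores.

```python
import random


def roc_auc(scores_healthy, scores_diseased):
    """Fraction of (healthy, diseased) pairs ranked correctly; ties count one half."""
    wins = sum(1.0 if h > d else 0.5 if h == d else 0.0
               for h in scores_healthy for d in scores_diseased)
    return wins / (len(scores_healthy) * len(scores_diseased))


def bootstrap_auc(healthy, diseased, n_trials=1000, seed=0):
    """Bootstrap distribution of the AUC over resampled subject sets."""
    rng = random.Random(seed)
    pooled = [(s, 0) for s in healthy] + [(s, 1) for s in diseased]
    aucs = []
    while len(aucs) < n_trials:
        # Draw len(pooled) subjects with replacement from the pooled set.
        sample = [rng.choice(pooled) for _ in pooled]
        h = [s for s, label in sample if label == 0]
        d = [s for s, label in sample if label == 1]
        if h and d:  # assumption: skip resamples lacking one of the two classes
            aucs.append(roc_auc(h, d))
    return aucs


if __name__ == "__main__":
    # Invented scores for illustration only.
    healthy = [1.4, 1.2, 0.9, 1.6, 1.1]
    diseased = [0.3, 0.6, 1.0, 0.4]
    aucs = bootstrap_auc(healthy, diseased)
    mean_auc = sum(aucs) / len(aucs)
    share_low = sum(a < 0.8 for a in aucs) / len(aucs)
    print(f"mean AUC = {mean_auc:.2f}, share of trials below 0.8 = {share_low:.1%}")
```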

Fig. 9

Variability of AUC values for the 2DOB and the LiMAx test. Variability was assessed by bootstrap resampling (1000 trials). Green-shaded bars: LiMAx. White bars: 2DOB score

Regarding the safety and operational feasibility of the test protocol, there were no complications with any participant. Nevertheless, to keep the time interval for the investigation as short as possible, we checked whether the duration of the test can be reduced to about 30 min in total. To this end, we used only the part of the DOB data recorded for 15 min after administration of 13C-bicarbonate and for 15 min after administration of 13C-methacetin for the adjustment of the compartment model and the estimation of kinetic parameters. With this short-time 2DOB test, the 2DOB score yielded almost the same classification quality for the group of 38 subjects (AUC = 0.94, 4 misclassifications). The results of the 15 min test are given in Supplementary Table 1.

Discussion

The need for non-invasive tests capable of detecting early-stage liver diseases

One of the main difficulties with liver diseases is that patients often do not present symptoms or signs until the disease becomes advanced. Around 50% of patients receive their first diagnosis when they arrive in accident and emergency units, typically in their 40s or 50s, with jaundice, gastrointestinal bleeding, abdominal swelling and disorientation. Detection of ongoing liver diseases at an early stage is required to prevent or reduce further disease progression by lifestyle changes and pharmacological treatment. Non-invasive diagnostic techniques currently used are serum biomarkers and transient elastography (TE). However, serum biomarkers are not liver specific and TE results require an expert clinician for interpretation (European Association for Study of Liver and Asociacion Latinoamericana para el Estudio del Higado 2015). Regarding the use of breath tests for the detection of liver diseases, the elevation in the breath of naturally occurring volatile biomarkers such as limonene, methanol and 2-pentanone has been reported to reliably identify patients with liver cirrhosis (Fernandez Del Rio et al. 2015). The same quality of discrimination between healthy and cirrhotic livers has been achieved with the LiMAx test (Buechter et al. 2018). However, differentiating between patients with and without cirrhosis alone has limited value, as this can be performed reliably with routine clinical methods. It is still a challenge to establish non-invasive liver function tests that are sensitive and specific enough to detect the early onset of restrictions in the metabolic capacity of the liver. Taking for granted that the metabolization of drugs is a reliable indicator for the functional capacity of the liver as a whole, functional breath tests such as the LiMAx are promising test candidates provided that major confounding factors can be eliminated.

Exhaled 13CO2 as reliable indicator of hepatic detoxification capacity

Using the kinetics of exhaled 13CO2 produced during hepatic metabolization of a labeled test compound as an indicator of the liver’s detoxification capacity inevitably raises the question of to what extent this signal is influenced by the systemic distribution of CO2. Measuring the rate of CO2 respiration does not solve this problem, as this measure does not capture the exchange of plasma CO2 with other body compartments. Therefore, we designed a test variant that allows us to assess how a defined bolus of 13CO2 is eliminated from the plasma of the patient and to exploit this information for a more reliable assessment of the drug detoxification rate. To this end, we used a fit-for-purpose compartment model that condenses the numerous physiological processes involved in the hepatic uptake and metabolization of the drug and the systemic distribution of CO2 into a manageable number of phenomenological parameters. The model provided an excellent fit to the measured DOB curves of all 38 subjects included in this study (see Fig. 1, Supplementary Information). Based on the high variability of the individual parameters for systemic CO2 kinetics (illustrated in Fig. 4), one may expect variations in the maximum of the DOB curve by a factor of 2 for normal subjects and a factor of 1.5 for subjects with restricted detoxification capacity.

The direct evaluation of the capability of the novel 2DOB breath test to provide reliable estimates of the hepatic drug detoxification rate would ultimately require measurements of arteriovenous drug concentration differences across the patient’s liver. As this has to be excluded for obvious reasons, an indirect evaluation may consist in testing the capability of the 2DOB test to correctly discriminate between liver-healthy and clinically diagnosed liver-diseased subjects. Taking into account the patient-specific CO2 kinetics in the estimation of the hepatic conversion rate of 13C-methacetin (parameter kL) provided a modest improvement of the discrimination between normal and impaired detoxification capacity compared with the conventional LiMAx test. This is plausible because the likelihood that the "true" maximum of the DOB curve (the basis for the definition of the LiMAx score) of patients with severe liver dysfunction is falsely shifted up to values within the normal range just due to extreme CO2 kinetics remains small: the coefficient of variation for the exemplary case shown in Fig. 4b was 0.14. The same holds for the likelihood of a strong downward shift of the DOB maximum for subjects with high detoxification capacity. However, for patients with detoxification capacities close to the borderline between normal and mild metabolic dysfunction, the correction of DOB values for the influence of the individual CO2 kinetics may be relevant. Seven of the 8 cases misclassified by the LiMAx but correctly classified by the 2DOB parameter kL lie in the middle range of kL and LiMAx values (see grey-shaded region in Fig. 5), where a marginal shift in the maximum of the DOB curve may reverse the classification result.

In summary, the already well-established LiMAx score appears to be robust against the bias caused by the systemic CO2 kinetics in subjects with either normal or drastically reduced hepatic detoxification capacity. However, in subjects with borderline capacities, the 2DOB test promises a significant improvement of the classification quality.

Plasma clearance of methacetin = metabolization and temporary storage

Generally, owing to its chemical similarity with acetaminophen, methacetin should rapidly and evenly distribute throughout most tissues and fluids as reported for acetaminophen (Forrest et al. 1982). Therefore, it is difficult to specify the volume of model compartment Y to convert the apparent rate constants into true rate constants. An intriguing finding of our study was that a substantial part of the test drug appears to be transiently stored in a body compartment that is closely associated with the liver. Distinction between two modes of plasma clearance of methacetin, reversible storage (without detoxification) and detoxification, was possible because the reversible exchange of the drug with the storage compartment influences the shape of the declining part of the DOB curve (see sensitivity analysis in Fig. 5 of the Supplementary Information). The decline of the DOB curve is influenced by both the uptake of 13CO2 from the plasma into other body compartments and the uptake of 13C-methacetin into the exchange compartment. Hence, the share of the CO2 kinetics in the shape of the declining part of the DOB curve has to be known to estimate reliable values for the relevant parameters, k+M and k−M, determining the magnitude of the parameter ML. An issue with the correct estimation of ML occurs if the DOB curve is very flat, i.e. for livers with very low detoxification capacity kL (see Fig. 3b). Fortunately, this has no impact on the 2DOB score because for very small values of kL the estimated value of ML is also small, i.e. ML ≪ MLc, so that the 2DOB score is determined by kL alone and the liver is classified as functionally impaired.

The physiological and anatomical nature of the methacetin-storing body compartments which we condensed into the single model compartment Y remains elusive. Owing to the chemical similarity of methacetin with acetaminophen (APAP), we searched the literature for known facts about the pharmacokinetics of APAP. A recently published compartment model of APAP clearance from the plasma worked well without introducing a reversible exchange of APAP (Mian et al. 2019). Reversible binding of acetaminophen to plasma proteins like albumin (the plasma level of which can be reduced in severe liver failure) can be excluded as this is a very rapid process in contrast to the rather slow uptake and release according to our modeling data. With an association rate constant of about 5.8 × 10^4 /M/s as determined for binding of tryptophan to albumin (Talbert et al. 2002) and an average albumin plasma concentration of 40 g/l = 600 μM, the apparent first-order rate constant for drug binding would be about 2000/s, which is about five orders of magnitude larger than typical values of the model parameter k−M. Thus, it is tempting to assume that compartment Y is identical with the liver itself, whereby the transient storage without chemical conversion is either confined to a special fraction of hepatocytes or occurs in an intracellular compartment of all hepatocytes. Hepatocytes have cytosolic binding proteins (e.g. ligandin) that act as storage compartments for drugs and endogenous metabolites. Therefore, with loss of hepatocytes, there is loss of storage capacity. This explanation receives support from our observation that the storage capacity ML was lower in the group of older controls compared with the group of younger controls (see section “Robustness of the 2DOB test and further test optimization” above), possibly reflecting the age-dependent decline of active liver mass (Wakabayashi et al. 2002).
As binding of the drug to cellular binding proteins will be similarly fast as binding to serum proteins, the rate constants for the reversible uptake into and release from hepatocytes should reflect the transport rates across the plasma membrane (Austin et al. 2005). The capacity of the liver to efficiently remove drugs and other xenobiotics from the circulation may also be determined by its capacity to transiently store the drug before it can be chemically converted. This may hold in particular in situations when the detoxifying enzyme systems become saturated. Testing this hypothesis would require quantifying the arteriovenous difference of 13C-methacetin and 13CO2 in a laboratory animal. In any case, storage of the test drug without immediate chemical conversion in the liver appears to be a mechanism that contributes to the rapid clearance of the drug from the plasma.

Conclusion

Individual variations in the systemic CO2 kinetics may have a significant influence on the parameters of the DOB curve. The novel test variant 2DOB takes this confounding effect into account and promises a significant improvement in the assessment of impaired hepatic detoxification capacity compared to the well-established LiMAx test in cases where the detoxification capacity is at the borderline between the normal and moderately reduced level. The suitability of the test for the reliable characterization of the natural history of chronic liver diseases (fatty liver → fibrosis → cirrhosis) has to be assessed in further studies. Validation of the test in an animal model would be ideal to remove remaining uncertainties of the model-based analysis by direct measurement of hepatic CYP activities and concentration profiles of 13CO2 and 13C-methacetin in different body compartments during the long-term progression of a liver disease (e.g. non-alcoholic fatty liver disease, NAFLD).