A novel variant of the 13C-methacetin liver function breath test that eliminates the confounding effect of individual differences in systemic CO2 kinetics

The principle of dynamic liver function breath tests is founded on the administration of a 13C-labeled drug and subsequent monitoring of 13CO2 in the breath, quantified as a time series of delta over natural baseline 13CO2 (DOB) liberated from the drug during hepatic CYP-dependent detoxification. One confounding factor limiting the diagnostic value of such tests is that only a fraction of the liberated 13CO2 is immediately exhaled, while another fraction is taken up by body compartments from which it returns with delay to the plasma. The aims of this study were to establish a novel variant of the methacetin-based breath test LiMAx that makes it possible to estimate and eliminate the confounding effect of systemic 13CO2 distribution on the DOB curve and thus enables a more reliable assessment of the hepatic detoxification capacity than the conventional LiMAx test. We designed a new test variant (named "2DOB") consisting of two consecutive phases. Phase 1 is initiated by the intravenous administration of 13C-bicarbonate. Phase 2 starts about 30 min later with the intravenous administration of the 13C-labeled test drug. Using compartment modelling, the resulting two-phasic DOB curve yields the rate constants for the irreversible elimination and the reversible exchange of plasma 13CO2 with body compartments (phase 1) and for the detoxification and exchange of the drug with body compartments (phase 2). We carried out the 2DOB test with the test drug 13C-methacetin in 16 subjects with chronic liver pathologies and 22 normal subjects, who also underwent the conventional LiMAx test. Individual differences in the systemic CO2 kinetics can lead to deviations of up to a factor of 2 in the maximum of DOB curves (coefficient of variation CV ≈ 0.2), which, in particular, may hamper the discrimination between subjects with normal and mildly impaired detoxification capacities.
The novel test revealed that a significant portion of the drug is not immediately metabolized, but transiently taken up into a storage compartment. Intriguingly, not only the hepatic detoxification rate but also the storage capacity for the drug turned out to be indicative of normal liver function. We thus used both parameters to define a scoring function which yielded an excellent disease classification (AUC = 0.95) and a high correlation with the MELD score (Spearman's R = 0.92). The novel test variant 2DOB promises a significant improvement in the assessment of impaired hepatic detoxification capacity. The suitability of the test for the reliable characterization of the natural history of chronic liver diseases (fatty liver, fibrosis, cirrhosis) has to be assessed in further studies.


Introduction
The rationale of dynamic liver function tests consists in using the hepatic elimination rate of a properly chosen test compound as a measure for the overall 'hepatic functional mass'. Liver damage may include dysfunction of subcellular organelles (e.g. endoplasmic reticulum and mitochondria) as well as a reduction in the number of metabolically active hepatocytes, both processes affecting the CYP-450-dependent detoxification rate of a test drug. Breath tests, in particular, share the principle that a subject is administered an exogenous test compound (e.g. aminopyrine, galactose, phenylalanine or methacetin) in which the common 12C atom of a functional group has been replaced by the stable 13C isotope. In the liver, 13CO2 is enzymatically cleaved from the functional group, typically by an enzyme of the CYP-450 family, enters the plasma and is then expired. As about 1% of the natural carbon atoms of our body compounds are present as the stable isotope 13C, the abundance of the excess 13CO2 in the breath is quantified as delta-over-baseline (DOB), defined as the relative concentration difference $\mathrm{DOB} = ([^{13}\mathrm{CO}_2] - [^{13}\mathrm{CO}_2]_{baseline}) / [^{13}\mathrm{CO}_2]_{baseline}$, given in per mille (‰).
Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s00204-020-02654-0) contains supplementary material, which is available to authorized users.
Breath tests are attractive because they are less invasive, relatively simple and well accepted by patients (Bonfrate et al. 2015; Braden et al. 2007). Amongst the various substrates utilized to evaluate quantitative liver function, the 13C-methacetin breath test (MBT) has been shown to be the most promising (Buechter et al. 2018; Klatt et al. 1997). In its original design, the MBT is performed by oral administration of the test drug and subsequent measurement of the cumulative percentage of the administered dose of 13C recovered as 13CO2 in the breath over time. Stravitz et al. (2015) reported that the oral MBT was superior to the MELD score in predicting the risk of cirrhotic complications and mortality in patients listed for liver transplantation. The oral 13C-MBT was also successfully evaluated in the assessment of acute liver injury in a rat model (Zhu et al. 2014).
As individual differences in the intestinal uptake of the test drug may distort the test results, an improved variant of the MBT (the LiMAx test) has been developed at the Charité Berlin, based on the intravenous administration of the drug and quasi-continuous monitoring of 13CO2 in the breath by means of an ultra-sensitive laser system (Stockmann et al. 2009). The LiMAx test has been successfully validated as a valuable non-invasive test in liver surgery for the pre- and postoperative assessment of organ function (Stockmann et al. 2009; Rubin et al. 2016) and for classifying the risk of cirrhotic complications and mortality in patients evaluated for liver transplantation (Jara et al. 2015). Moreover, in a retrospective study encompassing 102 patients with chronic liver disease who underwent a liver biopsy, the LiMAx test performed better than transient elastography (TE) and several serum biomarkers in detecting different stages of liver fibrosis and cirrhosis (Buechter et al. 2018). Nevertheless, the capability of the LiMAx test to reliably discriminate between a non-fibrotic (F0) and a mildly fibrotic (F1) liver was as poor as that observed with the oral version of the MBT (Dinesen et al. 2008; Razlan et al. 2011). Therefore, the LiMAx test requires further improvement before it can be accepted as a reliable clinical tool for the early detection of an incipient loss of the liver's functional parenchyma.
A detailed discussion of various confounding factors limiting the accuracy of the MBT is given in Gorowska-Kowolik et al. (2017). In brief, there are two different categories of confounding factors diminishing the reliability of functional breath tests. First, even if the breath test were able to provide precise information on the detoxification rate of the test drug, the utility of this parameter for the distinction between normal and diseased livers is restricted by several confounding factors such as the intake of medical drugs, smoking, aging or genetic variability, all of which may influence the expression level of the targeted metabolic pathway in the liver. Second, the kinetics of 13CO2 in the exhaled breath does not truly reflect the kinetics of enzymatic 13CO2 formation in the liver. This is a consequence of the systemic distribution of 13CO2: a certain fraction of newly formed 13CO2 is taken up into several body compartments from which it returns with delay to the plasma. Earlier studies on the systemic CO2 dynamics in humans (Barstow et al. 1989, 1990; Irving et al. 1983) revealed large variations in the rate of irreversible CO2 elimination and CO2 exchange with body compartments. Hence, depending on the size of the transiently trapped 13CO2 fraction in a specific subject, the DOB values may either over- or underestimate the actual hepatic formation rate of 13CO2. This fact is usually disregarded when equating the DOB value with the formation rate of 13CO2 in the liver.
To correct for the 'CO2 bias', we recently proposed a novel test variant (named "2DOB") which is initialized with the injection of a standard dose of 13C-labeled bicarbonate (yielding information on the actual systemic CO2 kinetics in the body of the individual) followed by injection of the 13C-labeled test drug just as in the conventional breath test (Holzhutter et al. 2013). Computer simulations suggested that the predictive power of the proposed 2DOB breath test to reliably assess the CYP-specific hepatic detoxification activity should be significantly higher than that of the conventional breath test. Here we report on a first preliminary study with 38 subjects on the practicability and utility of the proposed 2DOB test.

Subjects
Sixteen patients with different types of liver pathologies (10 males, 6 females) and 22 normal subjects (14 males, 8 females) with no history or biochemical signs of liver disease were enrolled in this study (for detailed information see Supplementary Table 1). In all except one patient, the diagnosis of the liver disease was confirmed by histologic evaluation of either a needle biopsy or a surgical specimen. In healthy controls, a panel of laboratory parameters (aminotransferases, γ-glutamyltransferase, coagulation factors, bilirubin, alkaline phosphatase and albumin) was determined to exclude the presence of liver disease.

Experimental protocols
The study was carried out at the Department of Surgery, Charité Campus Mitte and Campus Virchow-Klinikum, Charité-Universitätsmedizin Berlin from July 2016 to February 2018. The study protocol was approved by the ethics committee of the Charité-Universitätsmedizin Berlin and adhered to the latest version of the Declaration of Helsinki, and all the study participants provided written informed consent before enrollment. Every subject underwent two examination sessions, held on separate days.

2DOB test
To avoid isotopic effects from food intake (Jonderko et al. 2008), the test was started after at least 3 h of fasting. The subject was placed in a supine position. To analyze the whole breath continuously, a specifically designed face mask was used for easy collection of exhaled breath. The gas analyses were performed in real time at the bedside using the commercially available FLIP® detection device (Humedics GmbH, Berlin, Germany) with a newly developed device for spectral analysis of the 13CO2 and 12CO2 amounts in the exhaled breath (Rubin et al. 2011, 2016). The intravenous injection of 13C-labeled bicarbonate (named 13C-B in the following) was performed after about 10 min, when the DOB baseline had reached stable values. This injection resulted in a rapid increase of DOB values, reaching a maximum after a few seconds. For the model-based computational analyses of the experimentally determined DOB data, the time point at which the DOB value of an individual patient reached its maximum after injection of 13C-B was designated "0". Then, at time $T = T_m \approx 30$ min after the injection of 13C-B, a dose of 2 mg/kg body weight of 13C-labeled methacetin (13C-M) was administered intravenously, followed by a flush of 20 ml of isotonic saline solution. A time span of 30 min for phase 1 turned out to be sufficient for the unequivocal estimation of the parameters $k_{-C}$, $k_{+C}$ and $k_R$ while keeping the duration of the full test as short as possible. Administration of 13C-M gave rise to a second increase of the DOB values followed by a more or less steep decline depending on the methacetin-metabolizing capacity of the liver (see Fig. 2). Online sampling and analysis of the 13CO2/12CO2 ratio in the breath were performed with a time resolution of 25 s. To maintain a constant hemodynamic state, test subjects had to rest in a supine position throughout the test.
A compartment model (see below) was used to estimate, from the experimentally determined time-dependent DOB data, numerical values for the kinetic parameters characterizing the chemical conversion rate of 13C-M, the exchange of 13C-B and 13C-M with body compartments and the irreversible elimination of 13C-B. These parameters were then tested for their ability to serve as classifiers for the discrimination between normal and diseased livers (see below).

LiMAx test
The LiMAx test is basically identical with the second phase of the 2DOB test. After calibration of the DOB baseline curve, the test is initiated with the intravenous injection of a dose of 2 mg/kg body weight of 13C-M. Blood samples were taken from eight subjects at six time points (t = 60, 120, 300, 600, 1200, 1800 s) after 13C-M injection for the determination of the plasma values of 13C-M. The classification score of the test, LiMAx, was introduced in Stockmann et al. (2009) as

$$\mathrm{LiMAx} = \frac{\mathrm{DOB}_{max} \cdot R_{PDB} \cdot P \cdot \mathrm{BSA} \cdot MM}{BW} \qquad (1)$$

where $\mathrm{DOB}_{max}$ is the maximum value of DOB, $R_{PDB} = 0.011237$ is the baseline ratio 13CO2/12CO2 according to the Pee Dee Belemnite standard (Craig 1957), $P = 300$ mmol/h/m² is the CO2 production rate per m² body surface, BSA is the body surface area (in m²), $MM = 166$ is the molecular mass of the test drug 13C-methacetin and $BW$ denotes the body weight (in kg). BSA was computed by the formula $\mathrm{BSA} = 0.024265 \times BW^{0.5378} \times H^{0.3964}$ m², where $H$ is the height (in cm) of the subject (Haycock et al. 1978).
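The score calculation can be illustrated with a short sketch. It assumes that the LiMAx score is the product of the factors defined above divided by body weight (with DOB in per mille, so that the result is in µg/kg/h) and that the Haycock formula takes the height in cm; the function names `haycock_bsa` and `limax_score` are hypothetical and not taken from the study.

```python
def haycock_bsa(weight_kg: float, height_cm: float) -> float:
    """Body surface area in m^2 after Haycock et al. (1978); height in cm."""
    return 0.024265 * weight_kg ** 0.5378 * height_cm ** 0.3964


def limax_score(dob_max_permille: float, weight_kg: float, height_cm: float,
                r_pdb: float = 0.011237, p_co2: float = 300.0,
                molar_mass: float = 166.0) -> float:
    """Assumed form: DOB_max * R_PDB * P * BSA * MM / BW, in ug/kg/h."""
    bsa = haycock_bsa(weight_kg, height_cm)
    # p_co2 [mmol/h/m^2] * bsa [m^2] * molar_mass [mg/mmol] gives mg/h;
    # multiplying by DOB in per mille converts mg to ug.
    return dob_max_permille * r_pdb * p_co2 * bsa * molar_mass / weight_kg


if __name__ == "__main__":
    # Illustrative subject: 70 kg, 175 cm, peak DOB of 20 per mille.
    print(haycock_bsa(70.0, 175.0))
    print(limax_score(20.0, 70.0, 175.0))
```

Under these assumptions the score is strictly proportional to the peak DOB value, which is why any CO2-related distortion of the peak carries over directly into the score.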
The data of the LiMAx test were also analyzed by means of the reduced compartment model (represented by the non-shaded part in the reaction scheme shown in Fig. 2) in which the kinetic parameters for the exchange of CO2/bicarbonate with body compartments were not included (see Fig. 1).

Targeted multiple reaction monitoring (MRM) of methacetin and acetaminophen in the plasma
Pure metabolites were dissolved to a final concentration of 1 μM in MeOH and 0.1% formic acid, and injected by a syringe (7 μL/min) into the triple quadrupole hybrid ion trap mass spectrometer (QTrap 6500, Sciex, Framingham, MA). Precursor ions were fragmented in positive and negative electrospray ionization (ESI) modes, and the most intense fragment peaks were chosen and optimized for the following parameters: declustering potential (DP) for precursor ions, collision energy (CE), and collision cell exit potential (CXP) for fragment ions. Transitions were monitored and acquired at unit resolution (peak width at 50% was 0.7 Da with ± 0.1 Da tolerance) in quadrupoles Q1 and Q3. Metabolites were separated on a Reprosil-PUR C18-AQ column (1.9 μm, 120 Å, 150 × 2 mm ID; Dr. Maisch, Ammerbuch, Germany) at a controlled temperature of 30 °C. The four best transitions were chosen; all MRM instrument settings are given in Table 1. For the standard curve, 50 μL of plasma was mixed with 20 μL of a dilution series of acetaminophen (paracetamol) and methacetin at the following concentrations (800 nM, 200 nM, 50 nM, 12.5 nM, 3.125 nM, 781.25 pM, 195.31 pM, 48.83 pM, 12.21 pM, 3.05 pM) and 20 μL of internal standard. To precipitate plasma proteins, 180 μL of acetonitrile was added, and the mixture was vortexed for 20 s and incubated at room temperature for 10 min. Thereafter, the mixture was centrifuged at 16,000 rcf for 10 min. Then 100 μL of the supernatant was taken and mixed with 44.4 μL of HPLC-grade water, and 5 μL of the mixture was finally injected into the LC-MS system. The standard curve was acquired interlaced with the measurement of samples and was re-measured after the acquisition of two complete time series.
Blood samples were handled as described above with the following modification: 50 μL of plasma was mixed with […] Each sample was injected in triplicate. The samples (5 μL) were injected and compounds were separated on an LC instrument (1290 series UHPLC; Agilent, Santa Clara, CA), coupled online to a triple quadrupole hybrid ion trap mass spectrometer QTrap 6500 (Sciex). Data acquisition was performed with an ion spray voltage of 5.5 kV in the positive mode of the electrospray ionization source. Nitrogen as the collision gas was set to medium, the curtain gas was at 30 psi, and ion source gases 1 and 2 were at 50 and 70 psi, respectively. The interface heater temperature was set to 350 °C.
The LC buffer compositions were as follows: (A) 10 mM ammonium acetate in LC-MS grade H2O (adjusted with acetic acid to pH 3.5), (B) LC-MS grade acetonitrile with 0.1% formic acid. The 10 min gradient was set as follows: 2% B at minutes 0-1, linearly increasing to 45% B until minute 5, 98% B between minutes 6 and 7, and re-equilibration at 2% B between minutes 7.1 and 10.
Metabolite identification was based on three levels: (i) the correct retention time, (ii) four transitions, and (iii) matching MRM ion ratios of tuned pure metabolites as described previously (Lau and Ahmad 2013). Peak integration was performed using MultiQuant™ software v.2.1.1 (Sciex). All peaks were reviewed manually and adjusted if necessary. The peak area of the first transition per metabolite was used for subsequent calculations. An internal standard (13C9,15N-L-phenylalanine) was used to normalize all LC-MS runs for instrumental variations. For the calculation of absolute concentrations, the intensities of the three replicates (samples and standard curve) were averaged, and the concentration was obtained by linear interpolation between the two nearest values of the standard curve on a log-log plot.
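The interpolation step described above can be sketched as follows. This is a minimal illustration, assuming that the standard curve is available as (concentration, averaged intensity) pairs and that samples outside the calibrated range are extrapolated from the nearest pair of standards; the function name and the example numbers are hypothetical.

```python
import math


def concentration_from_intensity(intensity, standards):
    """Interpolate a concentration linearly on a log-log scale.

    `standards` is a list of (concentration, intensity) pairs with
    strictly positive values; the two neighbouring standards are used.
    """
    pts = sorted(standards, key=lambda p: p[1])
    # Index of the first standard whose intensity is >= the sample intensity.
    hi = next((i for i, (_, y) in enumerate(pts) if y >= intensity),
              len(pts) - 1)
    hi = max(hi, 1)  # below the lowest standard: use the first pair
    (c0, y0), (c1, y1) = pts[hi - 1], pts[hi]
    slope = (math.log10(c1) - math.log10(c0)) / (math.log10(y1) - math.log10(y0))
    log_c = math.log10(c0) + slope * (math.log10(intensity) - math.log10(y0))
    return 10 ** log_c


if __name__ == "__main__":
    # Hypothetical standard curve (nM, arbitrary intensity units).
    curve = [(3.125, 6.25), (12.5, 25.0), (50.0, 100.0), (200.0, 400.0)]
    print(concentration_from_intensity(60.0, curve))
```

Interpolating in log-log space keeps the relative error roughly uniform across a dilution series that spans several orders of magnitude, which a linear-scale interpolation would not.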

Compartment modeling of systemic 13CO2 and 13C-methacetin kinetics
We used the compartment model shown in Fig. 2 to describe the kinetics of 13C-methacetin (abbreviated as 13M) and the additionally formed H13CO3− + 13CO2 (abbreviated as Δ13C) resulting from the administration of 13C-labeled bicarbonate and enzymatic conversion of 13C-methacetin. Combining 13CO2 and H13CO3− into one variable is justified by the fact that both entities are in quasi-equilibrium due to the fast reaction catalyzed by the enzyme carbonic anhydrase. The model takes into account the exchange of 13CO2 + H13CO3− and 13C-methacetin between the blood and other body compartments (denoted by X and Y, respectively). Earlier studies of Barstow et al. (1985, 1990) and Irving et al. (1983) on the 13C-bicarbonate kinetics in humans have provided evidence for the existence of at least three CO2 exchange compartments differing in the characteristic time constants of the exchange kinetics. Over shorter time ranges of about 15-30 min as considered in our test, the error made by neglecting the slow exchange processes should remain sufficiently small.
The reaction scheme shown in Fig. 2 differs from the reaction scheme used in our previous work (Holzhutter et al. 2013) in the following respects: (i) the reversible exchange of the drug with non-hepatic compartments is now taken into account, (ii) the kinetics of the reaction product acetaminophen is not considered in the model as the plasma concentration of this drug was not monitored, (iii) exchange of 13C-methacetin with the liver, conversion to the reaction products acetaminophen and 13CO2 and release of the formed 13CO2 into the blood plasma are lumped together into a single process with rate constant $k_L$. The latter simplification was necessary as the reversible exchange of 13C-methacetin with the liver and with extra-hepatic organs cannot be discerned in the shape and magnitude of the DOB curve. Note that the parameter $k_R$ represents the total elimination rate of Δ13C from the plasma, including the release from the plasma into the breath as well as other modes of irreversible loss such as renal excretion and slow covalent fixation in organic molecules.

Kinetic equations
The reaction scheme shown in Fig. 2 is governed by a set of ordinary differential equations, equation system (2). For the meaning of the rate constants in equation system (2), see the legend to Fig. 2. Note that the variables $\Delta^{13}C_X^*$ and $^{13}M_Y^*$ represent effective concentrations in the compartments X and Y; they are linearly related to the true concentrations $C_X$ and $M_Y$ by scaling factors that depend on the unknown volumes of these compartments.

Fig. 2 Scheme of the compartment model. The variable Δ13C denotes the sum of additionally (above baseline) formed 13CO2 and H13CO3−; 13M denotes 13C-methacetin. Both metabolites can be exchanged between the plasma and other body compartments lumped together into single compartments (X and Y, respectively). Uptake of 13C-methacetin into the liver, conversion to the reaction products acetaminophen and 13CO2 by CYP1A2, and release of the enzymatically formed 13CO2 into the blood plasma are described by an overall reaction with rate constant $k_L$. Phase 1 starts with the intravenous administration of 13C-labeled bicarbonate. Phase 2 starts about 30 min later with the intravenous administration of 13C-labeled methacetin (see test protocols).
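To make the structure of the compartment model concrete, the following sketch integrates one possible form of the rate equations with a hand-written fourth-order Runge-Kutta step. It is an illustration under stated assumptions, not the authors' implementation: the equations are written down from the verbal description of the scheme (first-order exchange and elimination, with $k_{-C}$ and $k_{-M}$ taken as uptake into X and Y and $k_{+C}$ and $k_{+M}$ as release), the bolus sizes are in arbitrary units, and the values chosen for `k_plus_C` and `k_minus_C` are hypothetical.

```python
def derivatives(state, k):
    """Assumed first-order kinetics of the four model variables (per min)."""
    c, cx, m, my = state  # plasma d13C, d13C in X, plasma 13M, 13M in Y
    dc = k["kL"] * m - (k["kR"] + k["k_minus_C"]) * c + k["k_plus_C"] * cx
    dcx = k["k_minus_C"] * c - k["k_plus_C"] * cx
    dm = -(k["kL"] + k["k_minus_M"]) * m + k["k_plus_M"] * my
    dmy = k["k_minus_M"] * m - k["k_plus_M"] * my
    return (dc, dcx, dm, dmy)


def rk4_step(state, k, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    s1 = derivatives(state, k)
    s2 = derivatives(shifted(state, s1, dt / 2), k)
    s3 = derivatives(shifted(state, s2, dt / 2), k)
    s4 = derivatives(shifted(state, s3, dt), k)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, s1, s2, s3, s4))


def simulate_2dob(k, t_m=30.0, t_end=70.0, dt=0.05,
                  bicarb_bolus=1.0, drug_bolus=1.0):
    """Bicarbonate bolus at t = 0, methacetin bolus at t = t_m (minutes)."""
    state = (bicarb_bolus, 0.0, 0.0, 0.0)
    n_steps, n_inject = round(t_end / dt), round(t_m / dt)
    times, states = [0.0], [state]
    for i in range(n_steps):
        if i == n_inject:  # start of phase 2
            c, cx, m, my = state
            state = (c, cx, m + drug_bolus, my)
        state = rk4_step(state, k, dt)
        times.append((i + 1) * dt)
        states.append(state)
    return times, states


if __name__ == "__main__":
    # kL, k_plus_M, k_minus_M: values reported for subject #33; kR: cohort
    # mean; k_plus_C and k_minus_C are hypothetical placeholders.
    rates = {"kL": 0.039, "k_plus_M": 0.036, "k_minus_M": 0.082,
             "kR": 0.22, "k_plus_C": 0.05, "k_minus_C": 0.10}
    times, states = simulate_2dob(rates)
    peak = max(s[0] for s in states[600:])
    print("phase-2 peak of plasma d13C (arbitrary units):", peak)
```

In this sketch the plasma variable Δ13C stands in for the DOB signal up to a scaling factor, which is also an assumption; the two-phasic shape (fast decay after the bicarbonate bolus, second rise after the drug bolus) follows from the model structure alone.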

Definition of a scoring function for the quantification of hepatic detoxification capacity
Different combinations of model parameters were tried to define an appropriate scoring function that reliably quantifies the hepatic detoxification capacity. The power of a scoring function to discriminate between the detoxification capacities of healthy and diseased livers was evaluated by means of a ROC (receiver operating characteristic) curve analysis. The area under the ROC curve (AUC) is an effective and widely used measure for evaluating the discriminating power of a diagnostic test or statistical model. Threshold values of model parameters used in the scoring function were chosen such that the Youden index, $Y = TP + (1 - FP) - 1$ (TP: true positive rate = fraction of ill subjects correctly classified; FP: false positive rate = fraction of healthy subjects falsely classified), becomes maximal. Assessment of the statistical significance of the difference between any two classifiers based on the AUC was performed by a non-parametric test that accounts for the correlation of the ROC curves (Vergara et al. 2008). Sensitivity of the classification results to random variations of the training set of healthy and diseased subjects was checked by means of bootstrap resampling (Mossman 1995), based on synthetic data sets created by random selection (with permitted repetition) of original data.
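The ROC-based evaluation described above can be sketched in a few lines. The example assumes that lower classifier values indicate disease (as for a detoxification rate constant), computes the AUC in its rank-based (Mann-Whitney) form and selects the threshold that maximizes the Youden index; the function names and the example values are hypothetical and do not reproduce study data.

```python
def roc_points(diseased, healthy):
    """(threshold, TP rate, FP rate) for the rule 'value <= threshold is ill'."""
    points = []
    for thr in sorted(set(diseased) | set(healthy)):
        tp = sum(v <= thr for v in diseased) / len(diseased)
        fp = sum(v <= thr for v in healthy) / len(healthy)
        points.append((thr, tp, fp))
    return points


def auc(diseased, healthy):
    """Probability that a diseased value lies below a healthy one (ties: 0.5)."""
    wins = sum((d < h) + 0.5 * (d == h) for d in diseased for h in healthy)
    return wins / (len(diseased) * len(healthy))


def youden_optimal(diseased, healthy):
    """Point maximizing Y = TP + (1 - FP) - 1, which equals TP - FP."""
    return max(roc_points(diseased, healthy), key=lambda p: p[1] - p[2])


if __name__ == "__main__":
    ill = [0.010, 0.015, 0.020, 0.035]   # hypothetical classifier values
    well = [0.030, 0.040, 0.045, 0.050]
    thr, tp, fp = youden_optimal(ill, well)
    print("AUC:", auc(ill, well), "threshold:", thr, "Youden index:", tp - fp)
```

A bootstrap check as mentioned in the text would repeat these calls on data sets drawn with replacement from the original subjects; that step is omitted here to keep the sketch deterministic.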

Adjustment of the compartment model to measured DOB data
Both variants of the methacetin breath test, the conventional LiMAx test initiated by the injection of the test drug and subsequent monitoring of 13CO2 in the breath, and the proposed novel 2DOB test, were carried out in 38 subjects (for test procedures see Fig. 1). Figure 3 shows typical DOB curves of the 2DOB test obtained in a liver-healthy subject and a patient with liver cirrhosis. Experimental and computed DOB curves of all subjects are given in Fig. 1 of the Supplementary Information. The DOB curve data were used to parameterize the compartment model, i.e. to estimate numerical values for the model parameters (see Supplementary Information). Model fitting yielded numerical estimates for the rate constants $k_{+C}$, $k_{-C}$ of Δ13C exchange with body compartment X, the rate constant $k_R$ of irreversible Δ13C elimination, the rate constants $k_{+M}$, $k_{-M}$ of 13M exchange with body compartment Y and the rate constant $k_L$ for the hepatic detoxification of 13M. A two-step procedure, in which subsets of model parameters are fitted separately to the first and second phase of the DOB curve, was chosen because some parameters have a partially redundant influence on the height and shape of the DOB curve. For example, an increase of DOB values can be elicited by an increase of $k_L$ or a decrease of the CO2 elimination rate $k_R$. The sensitivity analysis in the Supplement "Computational Details" illustrates that estimation of numerical parameter values from different parts of the 2DOB curve is necessary for unequivocal estimates of the parameters $k_{+M}$ and $k_{-M}$.
From the DOB data of the LiMAx test, numerical estimates could be derived only for the four model parameters $k_R$, $k_{+M}$, $k_{-M}$ and $k_L$. Supplementary Table 1 lists the numerical values of all model parameters for each subject.
Once parameterized, the model allows simulating the time course of Δ13C and 13M in the compartments X and Y, respectively. An example is shown in Fig. 3. Compared with the healthy subject (HS), the patient with cirrhosis (DS) displays a higher uptake rate of Δ13C into body compartment X after administration of an identical bolus of 13C-bicarbonate. Nevertheless, the rise of Δ13C in compartment X of DS is lower owing to the significantly lower detoxification rate of 13M and the correspondingly lower liberation rate of Δ13C into the plasma (mirrored by the lower maximum of the DOB curve). Note that 40 min after administration of the test drug, a substantial portion of Δ13C is still retained in body compartment X. The model simulation predicts a faster decline of plasma 13M in HS compared with DS. This is a consequence of DS having both a lower detoxification rate of 13M and a lower uptake rate of 13M into compartment Y (Fig. 3d, e). We validated the model predictions by the good concordance between the theoretical curves and measurements of 13M at various time points after 13M administration (Fig. 3b). It has to be mentioned that for subjects with very low values of the detoxification rate constant $k_L$, there was a tendency of the computed time course of plasma methacetin to underestimate the true (i.e. experimentally determined) clearance of 13M from the plasma (see Fig. 2, Supplementary Information). This discrepancy is due to the fact that, as the rise of the DOB curve after 13M administration becomes smaller, the two parameters $k_{+M}$ and $k_{-M}$ for the exchange of 13C-methacetin with compartment Y become less identifiable from the DOB curve (see Supplement "Computational Details", Parameter Identifiability).
At the extreme, if there is no rise of the DOB curve at all, indicating that detoxification of the drug is absent ($k_L = 0$), the numerical values of $k_{+M}$ and $k_{-M}$ cannot be determined, as the reversible exchange of 13C-methacetin with compartment Y is not connected with the 13CO2 kinetics.

Influence of systemic CO2 kinetics
The values of the model parameters $k_{+C}$, $k_{-C}$ and $k_R$ derived from the first part of the DOB curve provide information on the systemic CO2/HCO3− kinetics of the subject. The numerical estimates for the Δ13C elimination rate $k_R$ varied between 0.085 and 0.40/min; the mean value across the 38 subjects amounts to 0.22/min. For a subject with a normal acid-base status of 25 mM plasma CO2 and a blood volume of 5 L, this translates into an average diurnal elimination rate of about 38 mol/day. This value is above the known resting value of about 20 mol/day for the respiratory CO2 elimination rate (Jonderko et al. 2008). Hence, on average, about one-half of the eliminated Δ13C must be rapidly taken up by a "fast" body compartment in a quasi-irreversible manner such that the slow return of this fraction to the plasma is not observable within the time span of the test. This is also reflected in an average 13CO2 recovery rate of 54% in the exhaled breath at T = 15 min after 13C-bicarbonate administration (for individual recovery rates see Fig. 4, Supplementary Information).

Fig. 3 (caption, beginning not recoverable) […] open circles) and the patient with liver cirrhosis (red curve, closed circles). c Simulated time course of additionally formed 13C-CO2/bicarbonate ($\Delta^{13}C_X^*$) in body compartment X. Note that relative concentrations are shown which are linearly related to the true concentrations by a scaling factor that depends on the unknown volume of compartment X. d Simulated time course of 13C-methacetin ($^{13}M_Y^*$) in body compartment Y. Note that relative concentrations are shown which are linearly related to the true concentrations by a scaling factor that depends on the unknown volume of compartment Y. e Flux changes in DS compared to HS. Bold arrow: increased flux, dotted arrow: decreased flux.
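The conversion of the rate constant into a daily elimination rate quoted above can be retraced with a few lines of arithmetic. The sketch assumes that the exchangeable pool is simply the plasma CO2 concentration times the blood volume, as stated in the text; the function name is hypothetical. With the rounded inputs given in the text, the product comes out close to 40 mol/day, which is of the same order as the quoted value of about 38 mol/day and roughly twice the resting respiratory elimination of about 20 mol/day.

```python
def daily_co2_elimination_mol(k_r_per_min: float,
                              plasma_co2_mmol_per_l: float = 25.0,
                              blood_volume_l: float = 5.0) -> float:
    """First-order elimination k_R * pool size, converted to mol per day."""
    pool_mmol = plasma_co2_mmol_per_l * blood_volume_l   # 125 mmol by default
    mmol_per_day = k_r_per_min * pool_mmol * 60 * 24
    return mmol_per_day / 1000.0


if __name__ == "__main__":
    for k_r in (0.085, 0.22, 0.40):   # minimum, mean and maximum from the text
        print(k_r, daily_co2_elimination_mol(k_r))
```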
Notably, the distribution of values for the CO2 elimination rate $k_R$ in the group of patients with a liver disease exhibits a general shift towards small values (see Fig. 3, Supplementary Information). Six liver patients had $k_R$ values smaller than 0.17 min⁻¹ (≈ 66% of the mean). Nevertheless, there was no statistically significant difference between the mean $k_R$ values in the group of healthy subjects (0.222/min) and diseased subjects (0.220/min) according to the two-sample t test (p = 0.77 at α = 0.05). Figure 4 illustrates how individual variations of the systemic CO2 kinetics may influence the shape of the DOB curve at fixed 13C-methacetin kinetics. The curves were simulated with fixed values of the kinetic parameters $k_L$, $k_{+M}$, $k_{-M}$ determining the methacetin kinetics, but varying values of the kinetic parameters $k_{+C}$, $k_{-C}$, $k_R$ obtained for the 38 subjects of this study. The resulting curves differ substantially in characteristic curve parameters such as peak value, time-to-peak value, area under the curve and steepness of decline. For subject #33 with normal liver function, the peak values of the DOB curves (see Fig. 4a) may differ from the "true" peak value by more than a factor of 2. Extreme DOB curves would be produced if subject #33 had the CO2 kinetics of subject #23 (maximal positive deviation) or subject #28 (maximal negative deviation). The coefficient of variation (CV = standard deviation/mean) of peak values is 0.25. For subject #37 with liver cirrhosis taken as reference, the spread of DOB curves is generally lower than for the normal subject (CV of peak values = 0.14). For the majority of DOB curves, the peak values at T = 300 s differ by less than a factor of 1.5 from the reference curve.
This suggests that the likelihood of overestimating the true drug-detoxifying capacity owing to variations of the CO2 kinetics is lower for a subject with a functionally impaired liver. The higher risk of over- or underestimating the 'true' hepatic detoxification capacity in subjects with normal or slightly impaired liver functionality, compared with subjects with severe liver dysfunction, is also reflected in the relationship between the detoxification rate constant $k_L$ and the parameter LiMAx (see Fig. 5). The latter serves as predictor of the detoxification capacity in the conventional LiMAx test and represents, up to a scaling factor, the peak value of the DOB curve. Despite the highly significant overall correlation of these two parameters, the largest deviations from the hyperbolic regression line occur in the middle range of $k_L$ values (grey-shaded area in Fig. 5), indicating either normal or mildly impaired detoxification capacity.

The predictive capacity of model parameters as disease classifiers
Next we studied the suitability of model parameters for the binary classification of the subjects liver into normal (disease class "1") or impaired (disease class "2"). Obviously, the model parameter k L quantifying the hepatic detoxification Fig. 4 Simulated DOB curves at fixed methacetin kinetics, but variable CO 2 kinetics. a The kinetic parameters determining the methacetin kinetics were fixed at those values obtained for subject #33 with normal liver function: k L = 0.039 min −1 , k +M = 0.036 min −1 , k −M = 0.082 min −1 . b The kinetic parameters determining the methacetin kinetics were fixed at those values obtained for subject #37 with liver cirrhosis: k L = 0.010 min −1 , k +M = 0.131 min −1 , k −M = 0.06 min −1 . The set of parameters, k +C , k −C , k R , determining the CO 2 kinetics were put to the values obtained for the 38 subjects, i.e. each curve represents the DOB curve of a subject comprising the methacetin kinetics of subject #9 (a) or subject #37 (b) but the CO 2 kinetics of one of the subjects 1-38. Note that the curves represent the second part of the 2DOB curves starting with the administration of 13 C-methacetin, i.e. time "0" corresponds to time T m of the full test curve. For a better comparison of curves, the residual DOB value at t = T m was subtracted so that the initial DOB values at time = 0 is zero for all curves. Blue curves: true DOB curve of the reference subject (subject #33 in A, subject #37 in b). Coefficient of variation (CV) of maximum DOB values: = 0.25 (a), 0.14 (b) rate of the test drug should serve as an appropriate classifier. Testing the predictive capacity of k L by means of a receiver operating characteristics (ROC) yielded a value of AUC = 0.85 for the area under the curve (AUC), a true positive rate (TP) of 0.75 and a true negative rate (TN) of 0.86. Thus, the predictive power achieved with the parameter k L was not better than the predictive power of the LiMAx score (AUC = 0.82, TP = 0.75, TN = 0.86). 
Notably, both k L and the LiMAx yielded false negative classifications for the same group of subjects. This suggests that a chronic liver disease need not necessarily be paralleled by a significantly lowered chemical conversion rate of the test drug.
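The ROC measures quoted above can be illustrated with a minimal sketch. It assumes that a lower k L indicates disease and uses the Mann-Whitney form of the AUC; the k L values and the cut-off below are hypothetical and are not data from this study.

```python
def roc_auc(diseased, normal):
    """AUC as the probability that a diseased subject has a lower score
    than a normal subject (Mann-Whitney statistic, ties count one half)."""
    wins = 0.0
    for d in diseased:
        for n in normal:
            if d < n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(diseased) * len(normal))


def rates_at_cutoff(diseased, normal, cutoff):
    """True positive and true negative rate when 'score < cutoff' means impaired."""
    tp = sum(1 for d in diseased if d < cutoff) / len(diseased)
    tn = sum(1 for n in normal if n >= cutoff) / len(normal)
    return tp, tn


# Hypothetical k_L values in 1/min (illustration only, not study data).
k_l_diseased = [0.010, 0.015, 0.022, 0.035]
k_l_normal = [0.030, 0.039, 0.041, 0.050, 0.055]

auc = roc_auc(k_l_diseased, k_l_normal)
tp, tn = rates_at_cutoff(k_l_diseased, k_l_normal, cutoff=0.028)
print(auc, tp, tn)
```

In this toy data set the diseased subject with k L = 0.035 is a false negative: the conversion rate lies in the normal range although the subject belongs to the diseased group, which is the situation described in the text.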
We then tested whether other model parameters may serve as possible disease predictors. Intriguingly, the parameter k −M describing the initial rate of methacetin uptake into the storage compartment Y yielded a surprisingly good disease classification (AUC = 0.72). The parameter k +M describing the release rate of methacetin from compartment Y also yielded a statistically significant classification with AUC = 0.65. Hence, the storage capacity of compartment Y for methacetin appears to be reduced in diseased livers. As the storage capacity of compartment Y depends on both k +M and k −M , we used the average concentration of methacetin stored in compartment Y over a time span T S = 3000 s after administration of 13 C-methacetin at time t = T M (see Fig. 6) as a new disease classifier M L :
$$M_L = \frac{1}{3000} \int_{T_M}^{T_M + 3000} Y(t)\, dt \qquad (3)$$
where Y(t) denotes the methacetin concentration in compartment Y and time is measured in seconds. This classifier yielded an AUC value of 0.81. Remarkably, the ROC curve associated with the measure M L already reached a true positive rate of more than 90% at a false positive rate of about 30%, i.e. M L is better suited than all other measures tested to reliably identify patients with liver disease. We then examined whether a combination of the two disease predictors, k L and M L , may provide an even better classification than either parameter alone. As there are endless possibilities to combine the model parameters into a scoring function, we followed the principle of Occam's razor "not to multiply entities without necessity" (Schaffer 2015) and used the simplest (linear) combination of the two model parameters k L and M L that individually provided the best classification results (Eq. 4). The 2DOB score defined in Eq. (4) is a continuous measure of the hepatic detoxification capacity, whereby values smaller than unity indicate an impaired liver function.
The 2DOB score is basically identical with the detoxification rate k L except for those cases where k L remains below the threshold value k Lc (indicating an impaired detoxification rate) whereas the storage capacity M L is larger than the threshold value M Lc . This definition of the 2DOB score implies that the liver is classified as impaired if and only if both parameters, k L and M L , are smaller than the respective cut-off values k Lc and M Lc (see Fig. 7).
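The storage measure M L and the decision rule described above can be sketched as follows. The sketch rests on assumptions that are not taken from the text: a linear two-compartment model (plasma methacetin M, storage compartment Y) with a unit dose in plasma at the time of administration, so that M L is obtained as a fraction of the dose rather than as a concentration. The cut-off values `k_lc` and `m_lc` are hypothetical and are not the values of Table 1. The rate constants are those quoted in the legend of Fig. 4.

```python
def storage_measure(k_l, k_plus_m, k_minus_m, t_s_min=50.0, dt=0.01):
    """Time-averaged content of compartment Y over t_s_min minutes (3000 s = 50 min).

    Assumed model (illustration only), unit dose in plasma at t = 0:
        dM/dt = -(k_l + k_minus_m) * M + k_plus_m * Y
        dY/dt =  k_minus_m * M - k_plus_m * Y
    All rate constants are given in 1/min.
    """
    def deriv(m, y):
        return (-(k_l + k_minus_m) * m + k_plus_m * y,
                k_minus_m * m - k_plus_m * y)

    m, y = 1.0, 0.0
    integral = 0.0
    for _ in range(int(round(t_s_min / dt))):
        y_old = y
        # One classical fourth-order Runge-Kutta step.
        a1, b1 = deriv(m, y)
        a2, b2 = deriv(m + 0.5 * dt * a1, y + 0.5 * dt * b1)
        a3, b3 = deriv(m + 0.5 * dt * a2, y + 0.5 * dt * b2)
        a4, b4 = deriv(m + dt * a3, y + dt * b3)
        m += dt * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0
        y += dt * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
        integral += 0.5 * (y_old + y) * dt  # trapezoidal rule for the time average
    return integral / t_s_min


def is_impaired(k_l, m_l, k_lc, m_lc):
    """Impaired if and only if both parameters fall below their cut-offs."""
    return k_l < k_lc and m_l < m_lc


# Rate constants from the legend of Fig. 4 (subject #33 normal, subject #37 cirrhosis).
m_l_33 = storage_measure(k_l=0.039, k_plus_m=0.036, k_minus_m=0.082)
m_l_37 = storage_measure(k_l=0.010, k_plus_m=0.131, k_minus_m=0.060)

# Hypothetical cut-offs, chosen only to make the example run.
k_lc, m_lc = 0.030, 0.30
impaired_33 = is_impaired(0.039, m_l_33, k_lc, m_lc)
impaired_37 = is_impaired(0.010, m_l_37, k_lc, m_lc)
print(round(m_l_33, 3), round(m_l_37, 3), impaired_33, impaired_37)
```

Under these assumptions the parameter set with the lower uptake rate and higher release rate gives the smaller time-averaged storage, in line with the reduced storage capacity reported for diseased livers.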
With the 2DOB score used as classifier, a significant improvement of the classification quality was indeed achieved. The cut-off values k Lc and M Lc (Table 1) were determined by maximizing the AUC value across the group of 38 subjects.

Relationship between the 2DOB score and the MELD score
As the 2DOB score is defined as a continuous measure of the hepatic detoxification capacity, it was tempting to check whether it correlates with parameters used in clinical practice for assessing the severity of chronic liver diseases. Figure 8 reveals a high correlation between the 2DOB score and the MELD (Model for End-Stage Liver Disease) score of the 38 test subjects (R Spearman = 0.92, p < 0.00001).
There exists a clear monotone and highly non-linear relationship between the 2DOB score and the MELD score. 2DOB scores smaller than 0.4 are consistently associated with MELD scores larger than 8. The only outlier was subject #33 (black data point in Fig. 8), who has a clinically inconspicuous liver, correctly reflected by a 2DOB score > 1, but an unusually high MELD score of 14 because this subject was under steady treatment with anticoagulants, which are known to influence the MELD score. As shown in Fig. 8, the 2DOB score proves to be a much more sensitive indicator of a beginning or moderate liver disease than the MELD score. This is no surprise, as the MELD score was developed and adapted as a prognostic tool in advanced liver diseases (Lau and Ahmad 2013).
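Why a rank correlation suits a monotone but strongly non-linear relationship can be illustrated with a small sketch. The data below are hypothetical (a score and a severity measure linked by an arbitrary monotone function) and do not reproduce Fig. 8; the sign of the result only reflects the direction chosen for this toy relationship.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Ordinary (linear) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: strictly monotone, strongly non-linear relationship.
score = [0.1, 0.2, 0.4, 0.8, 1.2, 1.6]
severity = [6.0 + 2.0 / s for s in score]

rho = spearman(score, severity)
r_lin = pearson(score, severity)
print(rho, r_lin)
```

The rank correlation reaches its extreme value for any strictly monotone relationship, whereas the linear coefficient is attenuated by the curvature.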
The classifications performed using the two age-corrected control groups separately (see Table 3, Supplementary Information) yielded identical statistical measures for the 2DOB test. Of note, the cut-off values for k L and M L were identical irrespective of the control group used. The ROC statistics of the LiMAx test were also only slightly affected by the choice of the control group. However, the LiMAx cut-off value separating normal and diseased livers differed markedly (282 versus 363) depending on whether the younger or the older controls were used as reference.
We also tested the robustness of LiMAx- and 2DOB-based classifications against random variations of the training set of healthy and diseased subjects using bootstrap resampling (Mossman 1995). The frequency distribution of AUC values shown in Fig. 9 was constructed by randomly drawing 38 subjects with replacement from the original set, i.e. the resampled set contained some data sets more than once (overrepresented) while others were missing. For each such synthetic data set, the AUC and the cut-off parameters were computed. This resampling was repeated 1000 times. The bootstrap variances of the AUC and of the cut-off values are given in the legend to Fig. 9. In less than 2% of trials, AUC 2DOB was below 0.8; the mean AUC 2DOB was 0.94.
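The resampling procedure described above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the scores are hypothetical, only the AUC (not the cut-off values) is recomputed, and resamples containing a single class are simply redrawn, which is an assumption of this sketch.

```python
import random
import statistics


def roc_auc(diseased, normal):
    """Mann-Whitney estimate of the AUC for 'lower score = diseased'."""
    wins = 0.0
    for d in diseased:
        for n in normal:
            if d < n:
                wins += 1.0
            elif d == n:
                wins += 0.5  # ties count one half
    return wins / (len(diseased) * len(normal))


def bootstrap_auc(scores, is_diseased, n_trials=1000, seed=0):
    """AUC values of n_trials resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_trials:
        idx = [rng.randrange(n) for _ in range(n)]  # draw n subjects with replacement
        diseased = [scores[i] for i in idx if is_diseased[i]]
        normal = [scores[i] for i in idx if not is_diseased[i]]
        if not diseased or not normal:
            continue  # AUC undefined for a single-class resample; redraw
        aucs.append(roc_auc(diseased, normal))
    return aucs


# Hypothetical scores for 16 diseased and 22 normal subjects (not study data).
scores = [(20 + 5 * i) / 100 for i in range(16)] + [(82 + 5 * i) / 100 for i in range(22)]
is_diseased = [True] * 16 + [False] * 22

auc_full = roc_auc(scores[:16], scores[16:])
aucs = bootstrap_auc(scores, is_diseased)
mean_auc = statistics.mean(aucs)
var_auc = statistics.variance(aucs)
frac_below_08 = sum(a < 0.8 for a in aucs) / len(aucs)
print(auc_full, mean_auc, var_auc, frac_below_08)
```

A fixed seed keeps the sketch reproducible; in practice the cut-off values would be re-optimized within each resample in the same loop.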
Regarding the safety and operational feasibility of the test protocol, there were no complications with any participant. Nevertheless, to keep the time interval for the investigation as short as possible, we checked whether the duration of the test can be reduced to about 30 min in total. To this end, we used only the part of the DOB data recorded for 15 min after administration of 13 C-bicarbonate and for 15 min after administration of 13 C-methacetin for the adjustment of the compartment model and the estimation of kinetic parameters. With this short-time 2DOB test, the 2DOB score yielded almost the same classification quality for the group of 38 subjects (AUC = 0.94, 4 misclassifications). The results of the 15 min test are given in Supplementary Table 1.

The need for non-invasive tests capable of detecting early-stage liver diseases
One of the main difficulties with liver diseases is that patients often do not present symptoms or signs until the disease has become advanced. Around 50% of patients receive their first diagnosis when they arrive in accident and emergency units, typically in their 40s or 50s, with jaundice, gastrointestinal bleeding, abdominal swelling and disorientation. Detection of ongoing liver diseases at an early stage is required to prevent or reduce further disease progression by lifestyle changes and pharmacological treatment. Non-invasive diagnostic techniques currently used are serum biomarkers and transient elastography (TE). However, serum biomarkers are not liver specific and TE results require an expert clinician for interpretation (European Association for Study of Liver and Asociacion Latinoamericana para el Estudio del Higado 2015). Regarding the use of breath tests for the detection of liver diseases, the elevation in the breath of naturally occurring volatile biomarkers such as limonene, methanol and 2-pentanone has been reported to reliably identify patients with liver cirrhosis (Fernandez Del Rio et al. 2015). The same quality of discrimination between healthy and cirrhotic livers has been achieved with the LiMAx test (Buechter et al. 2018). However, differentiating between patients with and without cirrhosis alone has limited value, as this can be performed reliably with routine clinical methods. It is still a challenge to establish non-invasive liver function tests that are sensitive and specific enough to detect the early onset of restrictions in the metabolic capacity of the liver.
Fig. 9 Variability of AUC values for the 2DOB score and the LiMAx. Variability was assessed by bootstrap resampling (1000 trials). Green-shaded bars: LiMAx. White bars: 2DOB score
Taking for granted that the metabolization of drugs is a reliable indicator for the functional capacity of the liver as a whole, functional breath tests such as the LiMAx are promising test candidates provided that major confounding factors can be eliminated.

Exhaled 13 CO 2 as reliable indicator of hepatic detoxification capacity
Using the kinetics of exhaled 13 CO 2 produced during hepatic metabolization of a labeled test compound as an indicator of the liver's detoxification capacity inevitably raises the question to what extent this signal is influenced by the systemic distribution of CO 2 . Measuring the rate of CO 2 respiration does not solve this problem, as this measure does not capture the exchange of plasma CO 2 with other body compartments. Therefore, we designed a test variant that makes it possible to assess how a defined bolus of 13 CO 2 is eliminated from the plasma of the patient and to exploit this information for a more reliable assessment of the drug detoxification rate. To this end, we used a fit-for-purpose compartment model that condenses the numerous physiological processes involved in the hepatic uptake and metabolization of the drug and the systemic distribution of CO 2 into a manageable number of phenomenological parameters. The model provided an excellent fit to the measured DOB curve data of all 38 subjects included in this study (see Fig. 1, Supplementary Information). Based on the high variability of the individual kinetic parameters for systemic CO 2 kinetics (illustrated in Fig. 4), one may expect variations in the maximum of the DOB curve by a factor of 2 for normal subjects and a factor of 1.5 for subjects with restricted detoxification capacity.
The direct evaluation of the capability of the novel 2DOB breath test to provide reliable estimates of the hepatic drug detoxification rate would ultimately require measurements of arteriovenous drug concentration differences across the patient's liver. As this has to be excluded for obvious reasons, an indirect evaluation may consist in testing the capability of the 2DOB test to correctly discriminate between liver-healthy and clinically diagnosed liver-diseased subjects. Taking into account the patient-specific CO 2 kinetics in the estimation of the hepatic conversion rate of 13 C-methacetin (parameter k L ) provided a modest improvement of the discrimination between normal and impaired detoxification capacity compared with the conventional LiMAx test. This is plausible because the likelihood that the "true" maximum of the DOB curve (the basis for the definition of the LiMAx score) of patients with severe liver dysfunction is falsely shifted up to values within the normal range merely due to extreme CO 2 kinetics remains small: the coefficient of variation for the exemplary case shown in Fig. 4b was 0.14. The same holds for the likelihood of a strong downward shift of the DOB maximum in subjects with high detoxification capacity. However, for patients with detoxification capacities close to the borderline between normal function and mild metabolic dysfunction, the correction of DOB values for the influence of the individual CO 2 kinetics may be relevant. Seven of the 8 cases misclassified by the LiMAx but correctly classified by the 2DOB parameter k L lie in the middle range of k L and LiMAx values (see grey-shaded region in Fig. 5), where a marginal shift in the maximum of the DOB curve may reverse the classification result.
In summary, the already well-established LiMAx score appears to be robust against the bias caused by the systemic CO 2 kinetics in subjects with either normal or drastically reduced hepatic detoxification capacities. However, in subjects with borderline capacities, the 2DOB test promises a significant improvement of the classification quality.

Plasma clearance of methacetin = metabolization and temporary storage
Generally, owing to its chemical similarity with acetaminophen, methacetin should rapidly and evenly distribute throughout most tissues and fluids, as reported for acetaminophen (Forrest et al. 1982). Therefore, it is difficult to specify the volume of model compartment Y to convert the apparent rate constants into true rate constants. An intriguing finding of our study was that a substantial part of the test drug appears to be transiently stored in a body compartment that is closely associated with the liver. Distinction between two modes of plasma clearance of methacetin, reversible storage (without detoxification) and detoxification, was possible because the reversible exchange of the drug with the storage compartment influences the shape of the declining part of the DOB curve (see sensitivity analysis in Fig. 5 of the Supplementary Information). The decline of the DOB curve is influenced by both the uptake of 13 CO 2 from the plasma into other body compartments and the uptake of 13 C-methacetin into the exchange compartment. Hence, the share of the CO 2 kinetics in the shape of the declining part of the DOB curve has to be known to estimate reliable values for the relevant parameters, k +M and k −M , determining the magnitude of the parameter M L . An issue with the correct estimation of M L occurs if the DOB curve is very flat, i.e. for livers with very low detoxification capacity k L (see Fig. 3b). Fortunately, this has no impact on the 2DOB score because for very small values of k L the estimated value of M L is also small, i.e. M L << M Lc , so that the 2DOB score is identical with k L and the liver is classified as functionally impaired.
The physiological and anatomical nature of the methacetin-storing body compartments, which we condensed into the single model compartment Y, remains elusive. Owing to the chemical similarity of methacetin with acetaminophen (APAP), we searched the literature for known facts about the pharmacokinetics of APAP. A recently published compartment model of APAP clearance from the plasma worked well without introducing a reversible exchange of APAP (Mian et al. 2019). Reversible binding of acetaminophen to plasma proteins like albumin (the plasma level of which can be reduced in severe liver failure) can be excluded, as this is a very rapid process in contrast to the rather slow uptake and release according to our modeling data. With an association rate constant of about 5.8 × 10 4 /M/s as determined for binding of tryptophan to albumin (Talbert et al. 2002) and an average albumin plasma concentration of 40 g/l = 600 μM, the apparent first-order rate constant for drug binding would be about 35/s (roughly 2000/min), which is more than four orders of magnitude larger than typical values of the model parameter k −M . Thus, it is tempting to assume that compartment Y is identical with the liver itself, whereby the transient storage without chemical conversion is either confined to a special fraction of hepatocytes or occurs in an intracellular compartment of all hepatocytes. Hepatocytes have cytosolic binding proteins (e.g. ligandin) that act as storage compartment for drugs and endogenous metabolites. Therefore, with loss of hepatocytes, there is loss of storage capacity. This explanation receives support from our observation that the storage capacity M L was lower in the group of older controls compared with the group of younger controls (see section "Robustness of the 2DOB test" below), possibly reflecting the age-dependent decline of active liver mass (Wakabayashi et al. 2002).
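The order-of-magnitude argument on albumin binding can be checked with a short worked calculation. It assumes a molar mass of albumin of about 66.5 kDa (a standard value not stated in the text) and takes k −M ≈ 0.08 min −1 from the legend of Fig. 4 as a typical value; the symbols k on and k app are introduced here only for the calculation.

```latex
\begin{align*}
[\mathrm{Alb}] &\approx \frac{40\ \mathrm{g\,l^{-1}}}{66.5\times 10^{3}\ \mathrm{g\,mol^{-1}}}
               \approx 6.0\times 10^{-4}\ \mathrm{M} = 600\ \mu\mathrm{M},\\[4pt]
k_{\mathrm{app}} &= k_{\mathrm{on}}\,[\mathrm{Alb}]
               \approx 5.8\times 10^{4}\ \mathrm{M^{-1}\,s^{-1}} \times 6.0\times 10^{-4}\ \mathrm{M}
               \approx 35\ \mathrm{s^{-1}} \approx 2.1\times 10^{3}\ \mathrm{min^{-1}},\\[4pt]
\frac{k_{\mathrm{app}}}{k_{-M}} &\approx \frac{2.1\times 10^{3}\ \mathrm{min^{-1}}}{0.08\ \mathrm{min^{-1}}}
               \approx 2.6\times 10^{4}.
\end{align*}
```

The product is about 35 per second, which corresponds to roughly 2000 per minute, the unit in which the model rate constants are reported. Binding to albumin would thus be faster than the modeled exchange with compartment Y by more than four orders of magnitude.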
As binding of the drug to cellular binding proteins will be similarly fast as binding to serum proteins, the rate constants for the reversible uptake into and release from hepatocytes should reflect the transport rates across the plasma membrane (Austin et al. 2005). The capacity of the liver to efficiently remove drugs and other xenobiotics from the circulation may also be determined by its capacity to transiently store the drug before it can be chemically converted. This may hold in particular in situations where the detoxifying enzyme systems become saturated. Testing this hypothesis would require quantifying the arteriovenous difference of 13 C-methacetin and 13 CO 2 in a laboratory animal. In any case, storage of the test drug without immediate chemical conversion in the liver appears to be a mechanism that contributes to the rapid clearance of the drug from the plasma.

Conclusion
Individual variations in the systemic CO 2 kinetics may have a significant influence on the parameters of the DOB curve. The novel test variant 2DOB takes this confounding effect into account and promises a significant improvement in the assessment of impaired hepatic detoxification capacity compared with the well-established LiMAx test in cases where the detoxification capacity is at the borderline between the normal and a moderately reduced level. The suitability of the test for the reliable characterization of the natural history of chronic liver diseases (fatty liver → fibrosis → cirrhosis) has to be assessed in further studies. Validation of the test in an animal model would be ideal to remove remaining uncertainties of the model-based analysis by direct measurement of hepatic CYP activities and concentration profiles of 13 CO 2 and 13 C-methacetin in different body compartments during the long-term progression of a liver disease (e.g. non-alcoholic fatty liver disease, NAFLD).