Key Points

Collectively, the data demonstrate that in vitro-derived induction parameters from CYP3A4 mRNA level changes can be confidently used with basic models to evaluate clinical induction potential. Further correction factors are proposed which, when applied to the basic equations, can reduce the number of false positives, more accurately bin true positives and negatives, and improve the quantitative translation of induction-based DDI risk.

 

1 Introduction

There is a need to assess the drug-drug interaction (DDI) potential of compounds by evaluating their propensity to inhibit or induce cytochrome P450 (CYP) enzymes, Phase II enzymes and transporters. If a compound is a CYP inducer, the resulting increase in enzyme can lead to increased biotransformation of the compound itself (autoinduction) or of co-medications. The primary mechanism by which drugs cause enzyme induction is the activation of gene transcription. Therefore, to assess induction of metabolizing enzymes and transporters through receptor-mediated agonism of PXR (e.g., CYP3A4, CYP2C9, P-gp, UGT), CAR (e.g., CYP2B6, CYP2Cs, SULT, UGT) and AhR (e.g., CYP1A2, UGT1A1, SULT1A1), cryopreserved human hepatocytes are often used, with known positive inducers, to provide confidence in the performance of the in vitro model. It is well accepted that the mechanisms underlying induction are consistent with other pharmacological processes, and, as such, characterizing the concentration response is an integral part of risk assessment. Regulatory agencies recommend that the induction potential of a new chemical entity (NCE) be evaluated using plated human hepatocytes from three separate donors. They further propose basic equations to be used to evaluate the clinical risk based on the in vitro-derived induction parameters (Table S1). The equations are referred to as F2, R3 and relative induction score (RIS) and employ the induction parameters, including the concentration resulting in a twofold increase in mRNA (F2), the concentration resulting in half-maximal fold induction (EC50) and the maximal fold induction (Emax), derived from nonlinear fitting of mRNA fold changes referenced to a housekeeping gene. Due to the importance of CYP3A in the metabolism of xenobiotics as well as the clinical precedent of strong induction, many recommendations have focused on CYP3A [1,2,3,4]. Although work on the translation of in vitro data for other enzymes has been conducted, it remains limited [5, 6].
Recently, a large dataset for clinically relevant CYP3A inducers was collected and analyzed as part of the efforts of the International Consortium for Innovation and Quality in Pharmaceutical Development Induction Working Group (IQ-IWG); however, there were minimal data for compounds that induce in vitro, are not also in vitro inhibitors, and are negative for clinical induction. Therefore, the goal of this work was to expand the true negative dataset, using clinical data for selective CYP3A substrates, and to develop recommendations for applying the basic models (F2, R3 and RIS) when interpreting induction risk.

2 Materials and Methods

2.1 Chemicals and Reagents

Williams E Media, Hepatocyte Thawing Medium (HTM), Hepatocyte Thaw and Plate Supplement (HTPM), Primary Hepatocyte Maintenance Medium Supplement (PHMM), Hanks' Balanced Salt Solution (HBSS), Geltrex™ LDEV-Free hESC-Qualified, Reduced Growth Factor Basement Membrane Matrix, Collagen-I-coated 96-well plates, mRNA Catcher™ PLUS Purification Kit, TaqMan™ Reverse Transcription Reagents, TaqMan™ Fast Advanced Master Mix and TaqMan™ Gene Expression Assay (hGAPDH HS99999905_m1, CYP3A4 HS00604506_m1) were obtained from Life Technologies (Carlsbad, CA, USA). Acebilustat, delafloxacin, lesinurad, raltegravir, rifaximin, rilpivirine, tafamidis, tasimelteon and tenofovir were acquired from Millipore Sigma (St. Louis, MO, USA). Binimetinib, doravirine, selexipag/NS 304 and sofosbuvir were from Thermo Fisher Scientific (Waltham, MA, USA). Human cryopreserved hepatocytes were obtained from Bioreclamation In Vitro Technologies (Baltimore, MD, USA). All other reagents were of sufficient grade or purity.

2.2 Identification of True Negative Clinical Inducers

To identify true negative clinical inducers for derivation of in vitro induction kinetic parameters, a targeted search of the University of Washington Drug-Drug Interaction database (UW-DIDB) was conducted. All clinical studies testing an effect (inhibition, no inhibition, induction, no induction or other PK interactions) were collected for the sensitive substrates highlighted in Table S2. Both positive (area under the curve ratio [AUCR] < 0.8) and negative (AUCR ≥ 0.8) clinical induction studies were collated for comparisons. Additionally, positive and negative clinical inhibition studies for the same substrate set were included to cross-check whether the negative inducers resulted in any relevant clinical inhibition. Once the test set was identified, the literature and UW-DIDB were searched for in vitro induction and inhibition parameters.

2.3 Culture of Cryopreserved Human Hepatocytes

Human cryopreserved hepatocytes from both male and female donors of differing age and racial origin were used (Table S3) and treated as previously described with minor modifications [7]. In short, cryopreserved hepatocytes were thawed for 1.5 to 2 min using a 37 °C water bath and combined with HTM prewarmed to 37 °C. The tube was capped and gently inverted four times to mix the cells, followed by centrifugation at 100 × g for 10 min. Following centrifugation, the supernatant was carefully aspirated without disturbing the pellet. Cells were resuspended in HTPM (2–3 ml per vial). Cells were manually counted by trypan blue exclusion and resuspended to a final viable cell count of 1.2 × 10⁶ cells/ml in HTPM. Prior to plating, the wells of Collagen-I-coated 96-well plates were pre-wet with 50 µl HTPM and prewarmed to 37 °C. To seed plates, 50 µl cell suspension was added to the pre-wet, warmed plates to achieve a final seeding density of 6.0 × 10⁴ cells/well. Cells were distributed within the well by sliding the plate approximately 2 inches in a north, south, east, west (↑↓→←) direction from center, and plates were placed in a humidified incubator at 37 °C with 5% CO2. Following an approximate 4-h attachment period, plates were removed and the medium was aspirated to remove any non-attached hepatocytes and cellular debris. Sandwich cultures were constructed by the addition of 100 µl ice-cold PHMM containing 0.35 mg/ml Geltrex. Plates were returned to a 37 °C, 5% CO2 humidified incubator for ~ 20 h. Compounds were dissolved in DMSO and added to the PHMM at the concentrations listed in Table S4 (final DMSO concentration, 0.1%), and incubations were conducted in triplicate. Treatment with test compound, vehicle or prototypical inducers was initiated ~ 24 h after seeding. Media were aspirated and replaced with fresh media containing test article at 48 h post-seeding.
On the second day of treatment, metabolic stability was characterized by removing aliquots from the remaining dosing solutions to serve as “0 min” samples, and media samples were collected at 4 and 24 h. Cell viability was assessed by visual inspection of the monolayer, checking for confluency and morphology. After a total treatment time of 48 h, the medium was removed, and the cells were washed with 100 µl HBSS. After washing, HBSS was aspirated and plates were sealed and stored at −80 °C prior to RNA isolation.

2.4 Determination of Relative mRNA Levels

The cells were lysed using lysis buffer and prepared for RNA isolation. After the isolation of RNA using commercially available kits, cDNA was synthesized using standard polymerase chain reaction (PCR) protocols. CYP3A4 and an endogenous housekeeping gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH) were quantified by real-time PCR. The gene-specific primer/probe sets were obtained from Applied Biosystems, and real-time PCR was performed using CYP3A4 and the endogenous control target cDNAs. The relative quantity of the target cDNA compared with that of the endogenous control was determined by the ΔΔ threshold cycle method (Applied Biosystems User Bulletin 2). Threshold cycle values > 32 were excluded from the analysis. Relative quantification measured the change in mRNA expression in test samples relative to that in vehicle control sample (0.1% DMSO).
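
The relative quantification step can be illustrated with a minimal sketch of the ΔΔ threshold cycle calculation. It assumes the conventional 2^(−ΔΔCt) form (i.e., a doubling of product per cycle), which the text does not state explicitly; the function name and the choice to return `None` for excluded wells are illustrative only.

```python
CT_CUTOFF = 32.0  # threshold cycle values above this were excluded from the analysis


def fold_change(ct_target_treated, ct_ref_treated, ct_target_vehicle, ct_ref_vehicle):
    """Relative mRNA level (treated vs. vehicle) by the ddCt method.

    Assumes 2**(-ddCt), i.e. ideal amplification efficiency (an assumption).
    Returns None when any Ct exceeds the cutoff, so that the well is dropped.
    """
    cts = (ct_target_treated, ct_ref_treated, ct_target_vehicle, ct_ref_vehicle)
    if any(ct > CT_CUTOFF for ct in cts):
        return None
    d_ct_treated = ct_target_treated - ct_ref_treated  # CYP3A4 minus GAPDH, treated
    d_ct_vehicle = ct_target_vehicle - ct_ref_vehicle  # CYP3A4 minus GAPDH, vehicle
    dd_ct = d_ct_treated - d_ct_vehicle
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: CYP3A4 shifts three cycles earlier, GAPDH is unchanged.
example_fold = fold_change(24.0, 18.0, 27.0, 18.0)
```

With these hypothetical inputs the target is detected three cycles earlier in the treated sample, which corresponds to an eightfold increase relative to the 0.1% DMSO vehicle.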

2.5 LC-MS/MS Analysis

The system used for LC-MS/MS analysis consisted of an ExionLC HPLC system and a Triple Quad 4000 or 4500 tandem mass spectrometer (AB Sciex, Framingham, MA). A 10-µl aliquot of sample was injected for analysis using the peak area ratio of analyte compared to proprietary internal standard (m/z 468.919 → 133.089 IS POS or 467.22 → 85.00 IS NEG). High-performance liquid chromatography separation was accomplished using Kinetex XB-C18 column (2.1 × 50 mm, 5 µm: Phenomenex, Torrance, CA). Mobile phase A consisted of 0.1% formic acid in water, and mobile phase B was 0.1% formic acid in acetonitrile. Analyte-specific gradient conditions and multiple reaction monitoring MS/MS details are presented in Supplemental Table S5.

2.6 Equations

The AUC0-24h (AUCall(Cobs)) was calculated using the linear trapezoidal rule (Eq. (1)), and the average concentration over the last day of dosing was calculated by dividing the AUC0–24h by 24 h.

$${\text{AUC}}_{{{\text{all}}({\text{Cobs}})}} = \sum\limits_{{i = 0}}^{{n - 1}} {\frac{{\left( {C_{i} + C_{{i + 1}} } \right) \times \left( {t_{{i + 1}} - t_{i} } \right)}}{2}}$$
(1)
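
As an illustration of Eq. (1), the following sketch applies the linear trapezoidal rule to a sampled concentration-time profile and derives the average concentration over the sampling window. The function names and the example concentrations are hypothetical; only the 0, 4 and 24 h sampling times are taken from the methods.

```python
def auc_linear_trapezoid(times_h, concs):
    """Eq. (1): sum of trapezoid areas between consecutive sampling times."""
    if len(times_h) != len(concs) or len(times_h) < 2:
        raise ValueError("need at least two matched time/concentration points")
    return sum(
        (concs[i] + concs[i + 1]) * (times_h[i + 1] - times_h[i]) / 2.0
        for i in range(len(times_h) - 1)
    )


def average_concentration(times_h, concs):
    """AUC divided by the length of the sampling window (24 h in this study)."""
    return auc_linear_trapezoid(times_h, concs) / (times_h[-1] - times_h[0])


# Hypothetical medium concentrations (µM) at 0, 4 and 24 h.
times = [0.0, 4.0, 24.0]
concs = [10.0, 8.0, 4.0]
auc = auc_linear_trapezoid(times, concs)    # (10+8)/2*4 + (8+4)/2*20
c_avg = average_concentration(times, concs)
```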

The free fraction of each test article in culture medium, fu,in vitro, was estimated using Eq. (2), where Funbound and Fbound refer to the unbound fraction and bound fraction of drug in culture medium or plasma, and [ALBUMIN] refers to the albumin concentration in plasma (42.5 mg/ml) or in incubation medium (1.25 mg/ml). There are several assumptions made when applying this equation including that (1) the plasma protein binding is primarily to albumin, (2) there is one binding site on albumin and (3) the binding affinity is similar between bovine serum albumin and human serum albumin.

$$\mathrm{Binding \; affinity}= \frac{\left[\mathrm{ALBUMIN}\right]\times {F}_{\mathrm{unbound}}}{{F}_{\mathrm{bound}}}$$
(2)
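
One way to apply Eq. (2) is sketched below: the binding affinity is first computed from the plasma unbound fraction and the plasma albumin concentration, and Eq. (2) is then rearranged for the unbound fraction at the albumin concentration of the incubation medium. The function names are illustrative, and the rearrangement assumes the same affinity holds in plasma and medium (assumption 3 above).

```python
ALBUMIN_PLASMA_MG_ML = 42.5   # albumin concentration in plasma
ALBUMIN_MEDIUM_MG_ML = 1.25   # albumin concentration in incubation medium


def binding_affinity(albumin_mg_ml, f_unbound):
    """Eq. (2): [ALBUMIN] x F_unbound / F_bound, with F_bound = 1 - F_unbound."""
    if not 0.0 < f_unbound < 1.0:
        raise ValueError("f_unbound must lie strictly between 0 and 1")
    return albumin_mg_ml * f_unbound / (1.0 - f_unbound)


def fu_in_vitro(fu_plasma):
    """Unbound fraction in medium, assuming the plasma-derived affinity applies.

    Rearranging Eq. (2) for the medium gives F_unbound = K / (K + [ALBUMIN]).
    """
    k = binding_affinity(ALBUMIN_PLASMA_MG_ML, fu_plasma)
    return k / (k + ALBUMIN_MEDIUM_MG_ML)


# A plasma fu of 0.01 gives approximately 0.256 in medium.
fu_medium = fu_in_vitro(0.01)
```

For a plasma fu of 0.01 this returns approximately 0.256, which agrees with the fu,in vitro value reported for pleconaril in Sect. 3.2.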

2.7 Estimation of EC50 and E max Parameters from In Vitro mRNA Hepatocyte Data

Concentration dependence was evaluated using standard linear regression and Spearman’s nonparametric rank correlation coefficient. If the r value from Spearman’s was positive with a significant p-value (< 0.05) and/or the slope of the simple linear regression model was positive, and the 95% confidence intervals did not contain zero, then non-linear regression curve fitting was performed. This step was adopted from [8] and is based on a decision tree to evaluate time-dependent inhibitors [9].

Induction parameters were determined using nominal (recovered) concentrations or the average concentration over the last 24 h, with and without consideration of the estimated fu,in vitro. F2 was calculated by solving Eq. (3) for the concentration that resulted in twofold increase in mRNA level. The clinical relevance of induction potential was assessed by calculating the R3 value using Eq. (4), described in the final US Food and Drug Administration (FDA) DDI guidance [10] and Pharmaceutical and Medical Devices Agency (PMDA) Drug Interaction Guideline [17]. In line with the European Medicines Agency (EMA) guideline, the F2/Cmax,ssu was calculated, and cut-off values were evaluated. In addition, previously determined RIS equations were used to evaluate generic RIS models (Eqs. 5 and 6).

$$Y = E_{\max } /\left( {1 + \exp \left( { - \left( {X - {\text{EC}}_{50} } \right)/b} \right)} \right)$$
(3)

In Eq. (3), Y is the relative fold induction, b is the Hill slope, X is the test article concentration, EC50 is the concentration eliciting half-maximal induction, and Emax is the maximum fold induction. In Eq. (4), X is the correction factor applied to the Cmax,u, d is the scaling factor (assumed to be 1), and Cmax,u is the maximal unbound plasma concentration of the inducer.

$$R3= \frac{1}{1+d\times \frac{{E}_{\mathrm{max}}\times {C}_{\mathrm{max},\mathrm{u}}\times X}{{\mathrm{EC}}_{50}+{C}_{\mathrm{max},\mathrm{u}}\times X}}$$
(4)
$$\mathrm{RIS}= \frac{{E}_{\mathrm{max}}\times {C}_{\mathrm{max},\mathrm{u}}}{{\mathrm{EC}}_{50}+{C}_{\mathrm{max},\mathrm{u}}}$$
(5)
$$\% \; \mathrm{decrease \; AUC}=\frac{22+(100 -22)}{1+{10}^{0.0963-\mathrm{log}(\mathrm{RIS})}}$$
(6)
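
The basic-model quantities in Eqs. (3) to (5) can be sketched as follows. F2 is obtained by setting Y = 2 in Eq. (3) and solving for X, which gives X = EC50 − b × ln(Emax/2 − 1); this closed form is a rearrangement made here for illustration and is defined only when Emax exceeds 2. Function and argument names are hypothetical, and the concentration scale of X is assumed to be whatever scale was used when fitting Eq. (3).

```python
import math


def f2_from_fit(emax, ec50, b):
    """Concentration giving a twofold response under Eq. (3).

    Returns None when Emax <= 2, because the fitted curve never reaches twofold.
    """
    if emax <= 2.0:
        return None
    return ec50 - b * math.log(emax / 2.0 - 1.0)


def r3(emax, ec50, cmax_u, x, d=1.0):
    """Eq. (4): x is the correction factor applied to Cmax,u; d is the scaling factor."""
    conc = cmax_u * x
    return 1.0 / (1.0 + d * emax * conc / (ec50 + conc))


def ris(emax, ec50, cmax_u):
    """Eq. (5): relative induction score."""
    return emax * cmax_u / (ec50 + cmax_u)


# Hypothetical parameters (Emax in fold, EC50 and Cmax,u in the same units).
f2_value = f2_from_fit(emax=10.0, ec50=5.0, b=1.0)
r3_value = r3(emax=10.0, ec50=1.0, cmax_u=1.0, x=1.0)
ris_value = ris(emax=10.0, ec50=1.0, cmax_u=1.0)
```

Passing `x=1.0` corresponds to applying no correction to Cmax,u, whereas larger values of `x` make the R3 estimate more conservative.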

The mechanistic static model previously reported [11, 12] was also used, Eq. (7).

$$\frac{{{\text{AUC}}_{i} }}{{{\text{AUC}}}} = \frac{1}{{F_{g} + \left( {1 - F_{g} } \right) \times \left( {\mathop \sum \nolimits_{k = 1}^{n} \frac{{fm\left( E \right)_{g,k} }}{{A_{g,k} \times B_{g,k} \times C_{g,k} }} + 1 - \mathop \sum \nolimits_{k = 1}^{n} fm\left( E \right)_{g,k} } \right)}} \times \frac{1}{{\mathop \sum \nolimits_{k = 1}^{n} \frac{{fm \left( E \right)_{h,k} }}{{A_{h,k} \times B_{h,k} \times C_{h,k} }} + \left( {1 - \mathop \sum \nolimits_{k = 1}^{n} fm\left( E \right)_{h,k} } \right)}}$$
(7)

where A is the reversible inhibition, B is the time-dependent inhibition, C is the induction, g is the gut, h is the liver, and k is the enzyme.

Since the enclosed analysis specifically identified inducers which were not reported to be inhibitors, the inhibition portions (reversible and time dependent) of the equation were removed, resulting in Eq. (8), which includes Eqs. (9) and (10), representing the induction in gut and liver, respectively.

$$\frac{{{\text{AUC}}_{i} }}{{{\text{AUC}}}} = \frac{1}{{F_{g} + \left( {1 - F_{g} } \right) \times \left( {\mathop \sum \nolimits_{k = 1}^{n} \frac{{fm\left( E \right)_{g,k} }}{{C_{g,k} }} + 1 - \mathop \sum \nolimits_{k = 1}^{n} fm\left( E \right)_{g,k} } \right)}} \times \frac{1}{{\mathop \sum \nolimits_{k = 1}^{n} \frac{{fm \left( E \right)_{h,k} }}{{C_{h,k} }} + \left( {1 - \mathop \sum \nolimits_{k = 1}^{n} fm\left( E \right)_{h,k} } \right)}}$$
(8)
$${C}_{g,k}=\frac{1}{1+ \frac{d \times {E}_{\mathrm{max}}\times {I}_{\mathrm{gut}}}{{I}_{\mathrm{gut}}+{\mathrm{EC}}_{50}}}$$
(9)
$${C}_{h,k}=\frac{1}{1+ \frac{d \times {E}_{\mathrm{max}}\times {I}_{\mathrm{liver}}}{{I}_{\mathrm{liver}}+{\mathrm{EC}}_{50}}}$$
(10)
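
A minimal sketch of the induction-only model, Eqs. (8) to (10), is given below for the special case of a single enzyme (n = 1). The restriction to one enzyme, the function names and the example values are assumptions made for illustration.

```python
def induction_term(emax, ec50, inducer_conc, d=1.0):
    """Eqs. (9)/(10): induction term C for gut or liver at a given inducer concentration."""
    return 1.0 / (1.0 + d * emax * inducer_conc / (inducer_conc + ec50))


def aucr_induction_only(f_g, fm_gut, fm_liver, c_gut, c_liver):
    """Eq. (8) for a single enzyme: product of the gut and liver terms."""
    gut = 1.0 / (f_g + (1.0 - f_g) * (fm_gut / c_gut + 1.0 - fm_gut))
    liver = 1.0 / (fm_liver / c_liver + (1.0 - fm_liver))
    return gut * liver


# Hypothetical inputs: Emax = 9-fold, EC50 = 1 µM, inducer at 1 µM in both tissues.
c = induction_term(emax=9.0, ec50=1.0, inducer_conc=1.0)   # 1 / 5.5
aucr = aucr_induction_only(f_g=0.5, fm_gut=1.0, fm_liver=1.0, c_gut=c, c_liver=c)
```

When the inducer concentration is zero, both induction terms equal 1 and the predicted AUC ratio is 1, as expected for no interaction.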

Igut was calculated using Eq. (11), and Iliver was calculated using Eqs. (12) and (13).

$${I}_{\mathrm{gut}}=\frac{1\times \mathrm{Ka}\times \mathrm{Dose}}{\mathrm{Qg}}$$
(11)
$${I}_{\mathrm{liver}}=\frac{\left(1-H\right)\times {f}_{u}}{\mathrm{Rb}}\times {I}_{\mathrm{inlet},\mathrm{max}}$$
(12)
$${I}_{\mathrm{inlet},\mathrm{max}}=\mathrm{Rb}\times {C}_{\mathrm{max},\mathrm{plasma}}+\frac{\mathrm{Fa}\times \mathrm{Fg}\times \mathrm{Ka}\times \mathrm{Dose}}{\mathrm{Qh}}$$
(13)

where blood flow in the gut (Qg) = 300 ml/min, blood flow in the liver (Qh) = 1.62 l/min, H is the hematocrit (assumed to be 0.45) and Rb is the blood-to-plasma ratio. Fa is the fraction absorbed, Fg is the fraction escaping gut metabolism, and Ka is the absorption rate constant; these were assumed in the analysis to be 1, 1 and 0.03 min−1, respectively.
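
The input concentrations in Eqs. (11) to (13) can be sketched as below, using the flow rates, hematocrit and default Fa, Fg and Ka values stated above. Expressing the dose in µmol and the flows in l/min, so that concentrations come out in µM, is a unit convention chosen here for illustration; the function names and example inputs are hypothetical.

```python
Q_GUT_L_PER_MIN = 0.300    # Qg, 300 ml/min
Q_LIVER_L_PER_MIN = 1.62   # Qh
HEMATOCRIT = 0.45          # H


def i_gut(dose_umol, ka_per_min=0.03):
    """Eq. (11): 1 x Ka x Dose / Qg, in µM when dose is given in µmol."""
    return 1.0 * ka_per_min * dose_umol / Q_GUT_L_PER_MIN


def i_inlet_max(cmax_plasma_um, dose_umol, rb, fa=1.0, fg=1.0, ka_per_min=0.03):
    """Eq. (13): Rb x Cmax,plasma + Fa x Fg x Ka x Dose / Qh."""
    return rb * cmax_plasma_um + fa * fg * ka_per_min * dose_umol / Q_LIVER_L_PER_MIN


def i_liver(fu, rb, inlet_max_um):
    """Eq. (12): (1 - H) x fu / Rb x I_inlet,max."""
    return (1.0 - HEMATOCRIT) * fu / rb * inlet_max_um


# Hypothetical inducer: 100 µmol dose, Cmax,plasma 1 µM, fu 0.1, Rb 1.
gut_um = i_gut(100.0)
inlet_um = i_inlet_max(cmax_plasma_um=1.0, dose_umol=100.0, rb=1.0)
liver_um = i_liver(fu=0.1, rb=1.0, inlet_max_um=inlet_um)
```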

Consistent with recently published work evaluating time-dependent inhibition of CYP3A [13], multiple iterations of the above model with various [I] input values were considered (Fig. S1).

Model 1: Using gut concentration and unbound hepatic inlet as input parameters as described in regulatory guidance.

Model 2: Inputting gut concentration corrected with fu,p and unbound hepatic inlet.

Model 3: Inputting unbound hepatic inlet as the gut concentration and unbound Cmax,ss as the hepatic concentration.

Model 4: Inputting the unbound average hepatic inlet concentration as the gut concentration and the average unbound concentration as the hepatic concentration.

Models were evaluated with published Fm values (Supplemental Table S12).

Equations presented in [14] were used to describe the clinical study impact of in vitro risk assessment.

The accuracy of the individual models was evaluated by deriving the geometric mean fold error (GMFE) according to Eq. (14).

$$\mathrm{GMFE}= {10}^{\mathrm{mean}\left(\left|\mathrm{log}\frac{\mathrm{predicted \; DDI}}{\mathrm{observed \; DDI}}\right|\right)}$$
(14)

The root mean square error (RMSE) was calculated according to Eq. (15).

$$\mathrm{RMSE}=\sqrt{\frac{\sum {\left(\mathrm{predicted \; DDI }-\mathrm{observed \; DDI}\right)}^{2}}{\mathrm{number \; of \; predictions}}}$$
(15)
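
Eqs. (14) and (15) can be computed as in the following sketch. The logarithm in Eq. (14) is taken as base 10, consistent with the base of the exponent; the function names and example values are illustrative.

```python
import math


def gmfe(predicted, observed):
    """Eq. (14): geometric mean fold error, using base-10 logarithms."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("predicted and observed must be non-empty and matched")
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10.0 ** (sum(logs) / len(logs))


def rmse(predicted, observed):
    """Eq. (15): root mean square error of the predictions."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("predicted and observed must be non-empty and matched")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predictions that are twofold high and twofold low.
example_gmfe = gmfe([2.0, 0.5], [1.0, 1.0])
example_rmse = rmse([2.0, 0.5], [1.0, 1.0])
```

Because the absolute value of the log ratio is averaged, a twofold overprediction and a twofold underprediction do not cancel, and the GMFE in this example is 2.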

3 Results

3.1 Identification of Clinically Negative Inducers for Derivation of Induction Parameters

Seventeen candidates were identified for in vitro evaluation including raltegravir, lesinurad, lersivirine, perampanel, flumazenil, sofosbuvir, tenofovir, rilpivirine, binimetinib, brivaracetam, delafloxacin, doravirine, rifaximin, selexipag, tafamidis, tasimelteon and acebilustat. Data for lersivirine, perampanel and flumazenil were available from [4], and brivaracetam was not commercially available. Searching the literature and the UW-DIDB confirmed that the selected test articles were not in vitro or clinically relevant inhibitors.

3.2 Determination of In Vitro Binding to Hepatocyte Medium and Metabolic Stability

A binding affinity equation was used to calculate the free fraction in media, fu,in vitro (Eq. 2). In the case of high plasma protein binding (PPB > 96%), free concentrations were decreased (> 25%), which was accounted for by multiplying the nominal EC50 or F2 values by the calculated fu,in vitro (Table 1). Metabolic stability was determined during the last 24 h of the incubation (Table 1). There was limited depletion (< 25% loss observed when comparing the Cavg to the T0 concentration) for acebilustat, delafloxacin, doravirine, lesinurad, raltegravir, rifampicin, rifaximin, tafamidis and tenofovir. A cut-off of > 25% was arbitrarily selected since it represents a change beyond typical bioanalytical assay variability. A correction factor of 0.2 was applied to nominal concentrations of pleconaril, which undergoes nonspecific binding [15]. Pleconaril is also highly bound to plasma proteins (fu 0.01) with an estimated fu,in vitro of 0.256. Several compounds in the dataset have reported PPB > 99%, including Cmpd 11 (fu 0.003), Cmpd 2 (fu 0.0007), efavirenz (fu 0.0025), rilpivirine (fu 0.003), selexipag (fu < 0.01) and tafamidis (fu 0.005). In these cases, the value for fu was set to 0.01, as recommended in the regulatory guidance [10, 16, 17]. Other positive inducers which were 99% protein bound included Cmpd 7 and Cmpd 8. Correction of in vitro binding was made under the assumption that the in vitro medium was consistent with the study conditions employed here [contained 1.25 mg/ml bovine serum albumin (BSA)]. While this medium formulation is common, it is likely that some of the variability observed in the induction parameters derived between laboratories may be in part due to media formulation differences [18,19,20]. Of note, while this study did not investigate bosentan, pioglitazone or rosiglitazone, the calculated fu,in vitro predictions of 0.41, 0.25 and 0.16 compared well with observed fu,in vitro of 0.38, 0.22 and 0.15 for bosentan, pioglitazone and rosiglitazone, respectively [15].

Table 1 Summary of derived in vitro binding parameters and metabolic stability results

3.3 Concentration Response Profiles

Increasing concentrations of many of the test compounds led to increases in CYP3A4 mRNA in all donors evaluated (Fig. 1). Binimetinib, doravirine, lesinurad, rifaximin, rilpivirine, selexipag, tafamidis, tasimelteon and rifampicin resulted in concentration-dependent increases in all three donors. Some compounds resulted in “bell-shaped curves” as described in [8]. In that work, a recommendation was made to remove data points which showed a > 30% decrease in magnitude of induction when compared with the preceding concentration, due to suspected cytotoxicity. In the case of sofosbuvir, the 300 µM data were removed from all donors to estimate the Emax. For lesinurad, concentrations > 100 µM were excluded. In two cases, different effects were observed across donors: rilpivirine showed signs of cytotoxicity at > 30 µM in donors LLT and LDQ, but at > 100 µM in donor YNZ, and delafloxacin was apparently cytotoxic in donor YNZ at concentrations > 100 µM, but not in donor LDQ until 300 µM. Delafloxacin led to concentration-dependent increases in one of three donors (LLT); sofosbuvir led to concentration-dependent increases in two of three donors.

Fig. 1

Concentration response curves for clinically negative CYP3A inducers. The fold increase in CYP3A mRNA level (y-axis) was plotted against concentration (x-axis) to derive the in vitro induction parameters. Each datapoint represents the mean and standard deviation for n = 3. Raltegravir is represented by a blue circle, sofosbuvir by a red square, binimetinib by a green upward triangle, delafloxacin by a purple upside-down triangle, rifaximin by a black circle of a larger size, selexipag by a brown square of a larger size, tafamidis by a blue upward triangle of a larger size, tasimelteon by a darker purple upside-down triangle of a larger size, acebilustat by a maroon diamond, tenofovir by a small hunter green circle, lesinurad by a small yellow star, rilpivirine by a small green cross and rifampicin by a small blue x. Panel A depicts the concentration response profile with Donor LLT, Panel B Donor LDQ and Panel C Donor YNZ. CYP cytochrome P450, mRNA messenger ribonucleic acid

3.4 Confirmation of True Negative In Vitro Inducers

True negative in vitro induction was confirmed by evaluating whether there was a statistically significant dose response and whether it translated to a meaningful fold induction over the tested concentrations. For the negative clinical inducers tested within this dataset, raltegravir, tenofovir and acebilustat were confirmed as true in vitro negatives in all three donors, whereas delafloxacin was negative in two donors and sofosbuvir negative in one donor (Table S6).

3.5 Induction Parameter Estimation

In the donors with concentration-dependent increases, the induction parameters could be estimated using the sigmoidal three-parameter model, Eq. (3), (Table 2). The average induction parameters were also calculated from the three donors and are designated as the “mean ± SD.” The data presented in Table 2 represent the induction parameters estimated using the nominal (added) concentrations. Rifampicin was used as a positive control inducer, and the donor-derived induction parameters were used for the in vitro to in vivo extrapolation (IVIVE) assessment. The induction parameters were also corrected based on the average concentration over the incubation and considering the estimated in vitro binding (Table 1). In general, the donor designated LDQ resulted in the highest magnitude of change (Emax) and the greatest potency (EC50). Based on this, LDQ was the most sensitive donor. The other two donors showed similar Emax/EC50 values although there was variability in the absolute EC50 and Emax values derived across the three donors. Consistent with literature reports, the variability in interindividual donor response was as high as 20-fold for both EC50 and Emax values [4]. Of note, a similar magnitude of variability in intraindividual donor response was reported in the same paper. In contrast, repeat experiments (> 5) using these donors, within this laboratory, reproducibly yielded fold induction values of rifampicin at 10 µM between 5 and 22% CV. By comparing the Emax/EC50, the donor variability was reduced to within eightfold.

Table 2 Summary of nominal derived induction parameters

3.6 Compilation of Literature-Reported Data

To better understand where translation between true negative and true weak induction was most accurate, the induction parameters for “clean” inducers were collated [4]. Clean inducers were those which showed in vitro induction and were further determined to not be in vitro reversible or time-dependent inhibitors. This was to ensure that induction was treated in isolation of competing mechanisms of DDI. The parameters derived from the most sensitive donor are reported in Table 3.

Table 3 Summary of nominal in vitro parameters for true positive clinical inducers used in the analysis

3.7 Assessment of Correction Factor to F2

The concentration resulting in twofold induction of mRNA (F2) is proposed as a criterion in the EMA guideline on DDI [16]. Here, the in vitro induction is positive if a concentration-dependent increase greater than twofold (> 100%) is observed at concentrations < 50-fold the unbound Cmax (i.e., F2/Cmax,u < 50). Likewise, the data can be considered negative if F2/Cmax,u ≥ 50. The ability to bin true positives (TP) and true negatives (TN) based on the F2 value was evaluated at 50, 30, 20, 15, 12, 10 and 5× Cmax,u (Table 4), as was the proportion of studies conducted unnecessarily (PPE) and the proportion of studies that were not conducted but should have been (NPE). For this assessment, the clinical studies were reduced to 28 TP and 16 TN by removing repeated dose levels of rifampicin (mean AUCR = 0.104 was used). When the induction parameter (F2) was corrected for in vitro binding and metabolic stability, the maximum number of false positives (FP) could be reduced from 12 to 8 by decreasing the Cmax,u multiplier from 50 to 10, with no impact on the number of false negatives (FN). When a correction factor of 5 was used, there were two false negatives including one of five trials with bosentan and oxcarbazepine. When the impact of binding or metabolic stability was not considered, pleconaril was a false negative even at 50-fold Cmax,u. When reported PPB values were used rather than fu 0.01, efavirenz, Cmpd 11, Cmpd 2 and rilpivirine were all FN at cut-offs of 20-, 5-, 8- and 6-fold Cmax,u, respectively. There was some improvement in the number of FP when using the F2 values derived from YNZ; however, this also corresponded to an increase in the number of FN values (Table S7). In general, the F2 value reliably predicted true clinical induction using a cut-off value of 10.
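
The binning exercise described above can be sketched as follows: a study is predicted positive when F2/Cmax,u falls below the chosen multiplier and is observed positive when AUCR < 0.8 (Sect. 2.2). The function name, data layout and example values are hypothetical and do not reproduce the study dataset.

```python
AUCR_POSITIVE_CUTOFF = 0.8  # observed clinical induction when AUCR < 0.8


def bin_by_f2(studies, multiplier):
    """Count TP, FP, TN and FN for a given Cmax,u multiplier.

    `studies` is an iterable of (f2, cmax_u, observed_aucr) tuples, with F2 and
    Cmax,u in the same concentration units.
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for f2, cmax_u, observed_aucr in studies:
        predicted_positive = f2 / cmax_u < multiplier
        observed_positive = observed_aucr < AUCR_POSITIVE_CUTOFF
        if predicted_positive and observed_positive:
            counts["TP"] += 1
        elif predicted_positive:
            counts["FP"] += 1
        elif observed_positive:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Hypothetical studies: (F2, Cmax,u, observed AUCR).
example_studies = [
    (1.0, 1.0, 0.50),
    (30.0, 1.0, 0.95),
    (100.0, 1.0, 0.90),
    (20.0, 1.0, 0.60),
]
at_50 = bin_by_f2(example_studies, 50)
at_10 = bin_by_f2(example_studies, 10)
```

In this toy example, lowering the multiplier from 50 to 10 removes the false positive but introduces a false negative, which illustrates the trade-off that the cut-off evaluation is intended to balance.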

Table 4 Summary of F2 findings for the full in vitro inducer set (Tables 2 and 3) when corrected for metabolic stability and binding

3.8 Assessment of Correction Factor to R3

The R3 equation is proposed by the FDA and PMDA as an alternative basic model for evaluating the induction risk (Eq. 4), and at present it is recommended by both agencies to incorporate a tenfold correction factor to the Cmax,u [10, 16, 17]. Varying correction factors (10-, 5- and 2-fold, and no correction) were evaluated for application of the R3 equation to determine which value might reduce the number of FP while having no impact on the number of FN (Table 5). The clinical dataset contained 16 TN studies and 22 TP studies. When incorporating the individual induction parameters EC50 and Emax into the R3 equation with a correction factor of twofold to the Cmax,u, there were no FNs. The best quantitative predictions were achieved with YNZ, with 86.1% of the dataset falling within twofold of the predicted value and 33.3% falling between 0.8- and 1.25-fold of the observed. When there was no correction factor applied to the Cmax,u, there was only one FN (pleconaril) across each donor, and the FP number could be reduced to 7. Having no correction factor also resulted in the best quantitative predictions, as measured by the GMFE closest to 1 and the lowest RMSE values. When the nominal induction parameters from the most quantitative donor were used, there were two FNs, pleconaril and oxcarbazepine, and the number of FP was 5. In this case the accuracy of prediction improved from 86.1 to 91.7% within twofold and from 33.3 to 47.2% between 0.8- and 1.25-fold of the observed (Table 5 and Table S8). When reported PPB values were used rather than fu 0.01, one trial of nine for efavirenz was considered a FN with twofold correction to Cmax,u. Visual representations for the performance of the R3 equation are presented in Panels A of Fig. 2 for LLT, LDQ, YNZ and mean donor parameters, respectively.

Table 5 Summary of R3 findings for the full in vitro inducer set (Tables 2 and 3) when corrected for metabolic stability and binding and using AUCR of < 0.8 as the positive cut-off value
Fig. 2

Predicted versus observed change in AUC. The predicted AUC change (y-axis) was compared with the observed AUC change (x-axis) for the clinical induction set using the induction parameters derived for donor LLT, LDQ, YNZ and mean in combination with the induction parameters reported in Kenny et al. [4]. The red line represents the line of unity, and the dotted green line represents twofold above the observation; the purple dotted line represents twofold below the observation. The individual blue dots represent individual clinical study results. The first panel shows the results when applying the R3 equation with 2× Cmax,ssu; the center panel shows the results when employing the generic RIS equation and the far right panel the results when applying the mechanistic static model using input parameters described as MSM 4. AUCR area under the curve ratio, Cmax,ssu unbound peak plasma concentration at steady state, R3 basic static model, RIS relative induction score, MSM mechanistic static model

3.9 Assessment of RIS and Proposed Cut-Off for AUCR

The relative induction score (RIS) is an alternative approach to evaluate induction risk, which relates the in vitro response derived from Eq. (5) to the observed clinical data. The RIS curve established to support the data analysis in previous studies from the IQ-IWG [4] (Eq. 6) was evaluated using the same clinical dataset used in the R3 analysis (16 TN and 22 TP). The best predictions were achieved when the parameters from donor YNZ were used. While there were still eight FPs, there were no FNs when a cut-off of < 0.7 for predicted AUCR was used (corresponding to a 30% decrease in AUC). Additionally, up to 97% of predictions were within twofold of the observed (Table 6). Furthermore, using reported PPB values rather than 0.01 resulted in no FN. If the % decrease in AUC was shifted to 20% (AUCR = 0.8), then there were no FNs and the FP dropped from 9 to 8 for LDQ and from 8 to 5 for YNZ, using nominal derived induction parameters (Table S8). Visual representations for the performance of the RIS equation are presented in Panels B of Fig. 2 for LLT, LDQ, YNZ and mean donor parameters, respectively.

Table 6 Summary of RIS findings for the full in vitro inducer set (Tables 2 and 3) when corrected for metabolic stability and binding and using AUCR of < 0.7 as the positive cut-off value

3.10 Assessment of the Mechanistic Static Model (MSM)

Regulatory agencies include the MSM as a basic method for clinical induction risk but recommend that it not be used for combined mixed mechanism (i.e., inhibition and induction) predictions. Since the in vitro inducers selected in this study were not mixed mechanism perpetrators, only the induction portion of the MSM was included. The entire clinical dataset was used for the MSM since it considers substrate specificity in the equation and includes 16 TN trials and 64 TPs. For direct comparison of MSM4 with the R3 and RIS models, the dataset was also reduced to align the clinical trials tested. The proprietary compounds collected by the IQ-IWG did not include data needed to calculate the hepatic inlet concentration (dose or MW), so they were excluded from the analysis, resulting in 16 TN and 16 TP studies. The MSM strongly overpredicted the clinical induction risk particularly as written in the DDI guidelines (Model 1), with all donors resulting in a high FP rate (15 or 16/16 TN trials) and minimal quantitative accuracy. There were two underpredictions when using budesonide as the CYP3A substrate and rifampicin as the strong inducer. Budesonide is strongly induced by rifampicin, to a greater extent than midazolam (decrease in AUC of 99.7% vs. ~ 90%, respectively). The MSM accurately predicts a strong induction but underpredicts the magnitude when comparing AUCR. Modifying the gut concentrations based on PPB (Model 2) slightly improved the predictions but they were still largely overpredicted. Further modification of input parameters, including using the unbound maximum hepatic inlet concentration as the gut concentration and unbound systemic concentration as the hepatic input (Model 3), resulted in reduction of FP and iterative improvement in the quantitative risk assessment; however, there was still a tendency for strong overprediction of induction magnitude. 
Applying the average concentrations rather than the maximal concentrations (Model 4), which is more aligned with PBPK dynamic modeling approaches, provided the best quantitative prediction of all the MSM variants evaluated and resulted in 0 FNs (Table 7). Model 4 was applied to the same dataset used in the R3 and RIS analysis to enable direct comparison of performance (Table 8). Model 4 resulted in a comparable number of FPs but slightly lower quantitative accuracy (GMFE and % within) and a higher number of overpredictions than both the RIS and R3 models. When considering the desire to limit the number of FNs, decrease the number of FPs and improve the quantitative accuracy, the basic model ranking followed RIS > R3 (2 × Cmax,ss) > MSM (Model 4). Visual representations for the performance of the R3 equation are presented in Panels C of Fig. 2 for LLT, LDQ, YNZ and mean donor parameters, respectively. The full dataset, which incorporated similar dose levels with different substrates used for the positive inducers, is displayed in Fig. 3.
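The induction-only form of the MSM discussed above can be sketched as follows. This is a minimal illustration, assuming the general net-effect equation given in regulatory DDI guidance with the inhibition terms set to 1. The function and parameter names (fm, fg, d, conc_gut, conc_liver) and all numerical inputs are hypothetical, and the exact definitions of Emax and the scaling factor d should be taken from Table S1 rather than from this sketch.

```python
def induction_term(emax, ec50, conc, d=1.0):
    """Induction term C = 1 + d * Emax * [I] / ([I] + EC50)."""
    return 1.0 + d * emax * conc / (conc + ec50)


def msm_aucr_induction_only(emax, ec50, conc_gut, conc_liver, fm, fg, d=1.0):
    """Predicted AUCR with the inhibition terms of the MSM set to 1.

    fm: fraction of substrate clearance via the induced enzyme (assumed input)
    fg: fraction of substrate escaping gut metabolism (assumed input)
    """
    c_gut = induction_term(emax, ec50, conc_gut, d)
    c_liver = induction_term(emax, ec50, conc_liver, d)
    gut = 1.0 / (c_gut * (1.0 - fg) + fg)
    liver = 1.0 / (c_liver * fm + (1.0 - fm))
    return gut * liver


# Hypothetical inducer and substrate, for illustration only.
params = dict(emax=10.0, ec50=1.0, fm=0.9, fg=0.5)
aucr_max = msm_aucr_induction_only(conc_gut=5.0, conc_liver=0.5, **params)
aucr_avg = msm_aucr_induction_only(conc_gut=1.0, conc_liver=0.1, **params)
print(f"maximal concentrations: AUCR = {aucr_max:.3f}")
print(f"average concentrations: AUCR = {aucr_avg:.3f}")
```

Because the induction term increases monotonically with concentration, replacing maximal with average input concentrations can only move the predicted AUCR toward 1, which is in line with the reduced overprediction reported for Model 4.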

Table 7 Summary of MSM findings using the full dataset and various input parameters
Table 8 Summary of MSM Model 4 findings using the same dataset as R3 and RIS
Fig. 3

Predicted versus observed change in AUC for an expanded substrate set using mechanistic static model 4. The predicted AUC change (y-axis) was compared with the observed AUC change (x-axis) for the clinical induction set using the average donor induction parameters in combination with those reported in Kenny et al. [4]. The red line represents the line of unity, the dotted green line represents twofold above the observation, and the dotted purple line represents twofold below the observation. The blue dots represent individual clinical study results. The upper left panel shows the results for donor LLT, the upper right shows the results for donor LDQ, the lower left shows the results for donor YNZ, and the lower right shows the results when applying the average donor induction parameters. AUCR area under the curve ratio

3.11 Internal Calibration of Induction Parameters Considering the Simcyp Approach and Provided Parameters for Rifampicin

An additional method, proposed for PBPK modeling, involves calibration of the in vitro-derived induction parameters against those reported for rifampicin, the goal being to reduce the contribution of inter-donor variability to the magnitude of induction achieved. In this approach, the optimized EC50 and Emax values used for rifampicin within Simcyp (EC50 = 0.32 and Emax = 16) are divided by the observed values to determine donor-specific scalars for each parameter, which are then applied to the corresponding parameters derived for the other inducers. For this evaluation, the mean data were used (Tables S9 and S10). There was a tendency for the calibrated data to provide a more conservative measure of induction risk, resulting in a greater number of FP values, increased overestimations, decreased quantitative accuracy and GMFE and increased RMSE.
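The calibration step described above amounts to a simple rescaling, sketched below. The Simcyp rifampicin values are those quoted in the text; the function names and the donor and inducer values in the usage lines are hypothetical and serve only to illustrate the arithmetic.

```python
# Optimized rifampicin values quoted in the text (units as in the source).
SIMCYP_RIF_EC50 = 0.32
SIMCYP_RIF_EMAX = 16.0


def donor_scalars(rif_ec50_obs, rif_emax_obs):
    """Donor-specific scalars: reference rifampicin value / observed value."""
    return SIMCYP_RIF_EC50 / rif_ec50_obs, SIMCYP_RIF_EMAX / rif_emax_obs


def calibrate(ec50, emax, rif_ec50_obs, rif_emax_obs):
    """Apply the donor scalars to another inducer's EC50 and Emax."""
    s_ec50, s_emax = donor_scalars(rif_ec50_obs, rif_emax_obs)
    return ec50 * s_ec50, emax * s_emax


# Hypothetical donor in which rifampicin gave EC50 = 0.8 and Emax = 10,
# and a hypothetical test inducer with EC50 = 5 and Emax = 6.
ec50_cal, emax_cal = calibrate(5.0, 6.0, rif_ec50_obs=0.8, rif_emax_obs=10.0)
print(f"calibrated EC50 = {ec50_cal:.2f}, calibrated Emax = {emax_cal:.2f}")
```

In this hypothetical case the observed rifampicin response is weaker than the reference, so calibration lowers EC50 and raises Emax for the test inducer, which illustrates how calibration can shift predictions toward a more conservative estimate of induction risk.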

3.12 Evaluation of Performance for Expanded Positive Inducer Set

Use of rifampicin is currently restricted because of the presence of a potentially carcinogenic impurity, 1-methyl-4-nitrosopiperazine (MNP); recommended intake levels are below those detected in available batches, precluding its use as a strong inducer [21]. Other strong inducers, including carbamazepine and phenytoin, have been proposed as potential alternatives, as has the use of moderate inducers to better understand the impact of induction on a CYP3A substrate. To evaluate the translation of the proposed models to these positive clinical inducers, the geometric mean values from the IQ-IWG paper [4] were used in the assessment, as they resulted in the best predictions for quantitative risk assessment (compared with worst-case donor, mean and median values) (Table 3). In total, 22 additional clinical studies were collected, with 5 TP for carbamazepine (AUCR range 0.124–0.603), 9 TP and 3 TN for efavirenz (AUCR range 0.240–0.929), 1 TP study with phenobarbital (AUCR = 0.394 with nifedipine) and 4 TP studies for phenytoin (AUCR range 0.105–0.607). All TP studies were appropriately binned using an F2 cut-off value of 10× Cmax,u. When applying the R3 equation with a twofold correction factor, three of the five carbamazepine studies were predicted within twofold of the observed value. All the efavirenz studies except one were predicted within twofold, although the three TN studies were FPs. The phenobarbital study was correctly predicted. Three of the four phenytoin studies were predicted within twofold. The RIS model yielded results very similar to those of the R3 equation, with 81.8% of studies predicted within twofold. MSM Model 4 was the only model able to accurately capture two of the three TN efavirenz studies, and it quantitatively predicted 77.3% of the results within twofold. Overall, the recommendations made from the results of the clean inducer set were consistent with the outcome of the expanded true positive inducer set.
In addition, the data demonstrate that for carbamazepine and efavirenz, which both show reversible inhibition of CYP3A in vitro, using the induction parameters alone predicts the observed clinical outcome well. Calibration to rifampicin was also investigated and, consistent with the observations from the clean inducers, tended to provide a more conservative estimate of risk (Table S11). Visual representations for the performance of the equations are presented in Figs. S2 and S3 for the mean donor parameters, with and without rifampicin calibration, respectively.
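The quantitative accuracy measures used in these comparisons can be sketched as follows. This is a minimal illustration, assuming the commonly used definitions of the fraction of predictions within twofold and of the geometric mean fold error (GMFE); the function names and the example predicted/observed pairs are hypothetical and are not data from this study.

```python
import math


def fraction_within_twofold(pairs):
    """Fraction of (predicted, observed) AUCR pairs with 0.5 <= pred/obs <= 2."""
    hits = sum(1 for pred, obs in pairs if 0.5 <= pred / obs <= 2.0)
    return hits / len(pairs)


def gmfe(pairs):
    """Geometric mean fold error: 10 ** mean(|log10(pred / obs)|)."""
    mean_abs_log = sum(abs(math.log10(pred / obs)) for pred, obs in pairs) / len(pairs)
    return 10 ** mean_abs_log


# Hypothetical (predicted AUCR, observed AUCR) pairs, for illustration only.
example = [(0.30, 0.25), (0.10, 0.40), (0.55, 0.60), (0.45, 0.20)]
print(f"within twofold: {100 * fraction_within_twofold(example):.1f}%")
print(f"GMFE: {gmfe(example):.2f}")
```

With 22 studies, the reported values of 81.8% and 77.3% within twofold correspond to 18 and 17 studies, respectively.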

4 Discussion and Conclusion

The potential for drug-drug interactions (DDI) is often evaluated as part of early discovery and development using a series of well-defined in vitro assays. The relative importance of any DDI liability is considered among a plethora of additional factors, and it is not uncommon to accept a potential DDI as part of a comprehensive data package aligned to effectively address patient needs. To aid informed decision making, including the appropriate timing of clinical DDI studies, it is therefore of great interest to establish how accurately in vitro DDI data translate to clinical outcome.

There have been numerous publications evaluating the translation of in vitro-derived induction parameters to predict clinical risk [1, 2, 4, 22, 23]. Of note, despite a tendency for overprediction, there were still several false negative inducers when applying the basic models without consideration of in vitro binding or stability [4]. False negatives included dexamethasone, pleconaril, oxcarbazepine, Cmpd 2, 11, 15, efavirenz, lopinavir, nevirapine and troglitazone. Dexamethasone is a unique outlier in that hepatocyte incubations for in vitro induction studies typically contain low levels of dexamethasone, which are added to promote phenotypic architecture, suppress proliferation, inhibit induction of stress signaling pathways and aid in expression of liver-enriched transcription factors, including those nuclear receptors required for DMET induction [24,25,26,27,28,29]. Thus, it is likely that the dexamethasone parameters are underestimated (Emax) or overestimated (EC50) under standard incubation conditions.

In line with the regulatory recommendations, a maximal value of 0.01 was applied for fu. When reported binding values were used, there was an increase in FNs using the F2 equation with 10× Cmax,u, including all trials with efavirenz. There were no FNs using the RIS approach. Metabolic stability, as recommended in the EMA guideline [16], was also included. While no guidance is offered on how to apply these data, there is emerging literature evidence that it is important to consider the temporal nature of both the induction time course and the concentration of the inducer [8, 15, 30]. The data generated within this paper suggest that, to reduce the potential for false negatives, both the in vitro binding and the metabolic stability should be considered in the derivation of the induction parameters. The use of Cav with the proposed cut-offs of > 25% loss and > 25% reduction in unbound fraction appears reasonable.

When comparing the predicted AUCR, the MSM yielded underpredictions for rifampicin with budesonide as the substrate. Rifampicin strongly decreases the AUC of budesonide, with reported AUCRs of 0.003 and 0.007. The most sensitive donor projection was 0.111, which, when compared directly with the observed AUCR, corresponds to a 37-fold underprediction. Of note, all the rifampicin predictions are within 0.8- to 1.25-fold of observations when the %decrease in AUC is compared rather than the AUCR. It would be valuable to better understand this lack of concordance and which endpoint (i.e., AUCR vs. %decrease in AUC) is more meaningful to compare.
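The difference between the two endpoints can be reproduced with the values quoted above. The short sketch below uses the observed budesonide AUCRs (0.003 and 0.007) and the most sensitive donor projection (0.111) from the text; the function names are illustrative only.

```python
def aucr_fold(pred_aucr, obs_aucr):
    """Fold difference when predicted and observed AUCR are compared directly."""
    return pred_aucr / obs_aucr


def pct_decrease(aucr):
    """Percentage decrease in AUC corresponding to a given AUCR."""
    return 100.0 * (1.0 - aucr)


def pct_decrease_ratio(pred_aucr, obs_aucr):
    """Predicted/observed ratio when the %decrease in AUC is compared."""
    return pct_decrease(pred_aucr) / pct_decrease(obs_aucr)


predicted = 0.111  # most sensitive donor projection (from the text)
for observed in (0.003, 0.007):  # reported budesonide AUCRs (from the text)
    print(
        f"observed AUCR {observed}: "
        f"AUCR fold = {aucr_fold(predicted, observed):.1f}, "
        f"%decrease ratio = {pct_decrease_ratio(predicted, observed):.2f}"
    )
```

Because the AUCR is bounded at zero, small absolute differences at very low AUCR values translate into large fold errors, whereas the corresponding %decrease values (88.9% predicted vs. 99.7% observed) differ only modestly.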

There are limitations to the analysis conducted herein, including that assumptions were made in correcting the positive control values based on literature data and that the dataset was limited to induction-only interactions. Despite these limitations, the analysis demonstrates that reducing the correction factors applied to the basic equations can improve the accuracy of predictions, and it points to factors of 10× and 2× for the F2 and R3 equations, respectively. In addition, the best quantitative accuracy was achieved when using the RIS model with an AUCR cut-off of < 0.7 as positive. Lastly, the accuracy of the MSM was improved when the average inlet concentration was used for the gut portion and the average unbound systemic concentration was used for the hepatic input concentration, which is consistent with a recent report evaluating prediction methods for TDI of CYP3A [13].