Key Points

Pharmacogenetic profiling and therapeutic drug monitoring (TDM) have both been proposed to manage variability in drug exposure.

For drugs with low inter-occasion variability and residual unexplained variability, TDM provides a closer estimate of true exposure, whereas for drugs with low unexplained inter-individual variability, pharmacogenetic profiling is more accurate for the majority of patients.

Genotype prevalence, the influence of pharmacogenetic differences on the pharmacokinetics (PK), the type of target exposure metric, and the influence of PK characteristics on the capacity to measure exposure with TDM also affected which approach was more accurate.

The presented quantitative framework can be used to select between TDM and pharmacogenetic testing, under different conditions or specific drug cases.

1 Introduction

Clinical trials are typically designed to identify the efficacy and safety of a novel drug on a population level [1]. However, physicians treat their patients on an individual level, where drug responses can vary widely between individuals [1]. Variability in drug exposure, i.e., in the pharmacokinetics (PK) of a drug, can arise from a wide variety of factors, such as genetic differences, environmental factors, and demographic characteristics. Individuals with high drug concentrations will have a higher risk of toxicity, and patients with low drug concentrations may experience suboptimal efficacy. Large variability in PK is particularly relevant for drugs that have a (1) clear correlation between drug concentration and response, (2) high inter-individual variability (IIV), and (3) narrow therapeutic window [2].

Pharmacogenetics is a scientific discipline that explores the correlation between genetic variations and IIV in PK (Fig. 1). These genetic variants can be assessed before commencing drug therapy, thereby serving as an a priori biomarker for personalized dosing [3]. The field of pharmacogenetics is rapidly expanding, and its clinical integration has become increasingly feasible with the proliferation of pharmacogenetic tests [4], the development of clinical guidelines, and the downward trajectory of costs [5]. An illustrative example of this progress is the growing utilization of dihydropyrimidine dehydrogenase (DPD) deficiency testing to inform the dosing of fluoropyrimidines [6], as DPD is an enzyme responsible for the metabolism of chemotherapeutic agents like 5-fluorouracil (5-FU) and capecitabine. Patients with diminished DPD enzyme activity experience elevated chemotherapy exposure and more pronounced adverse effects, highlighting the necessity of DPD deficiency testing prior to initiating certain chemotherapies. In addition, pharmacogenetic tests commonly target cytochrome P450 (CYP) enzymes, which exert an influence on drug clearance and the formation of active metabolites. Although genetic variations account for a portion of the IIV in PK, the predictive capacity of these tests can be limited by substantial unexplained IIV, or by the fact that they most often relate to a single PK parameter, such as clearance (Fig. 1), whereas the exposure metric of interest, e.g., the trough concentration (Ctrough), often depends on multiple PK parameters.

Fig. 1

Visual representation of the sources of variability and impact on pharmacokinetics. Pharmacokinetic profiles of tacrolimus representing different sources of variability. The black lines represent the typical (no IIV) profiles for the CYP3A5 expressers (dashed line: CYP3A5*1/*1 or CYP3A5*1/*3) and the CYP3A5 non-expressers (solid line: CYP3A5*3/*3). The solid colored lines represent the profiles for two different patients, one CYP3A5 expresser (blue line: individual with CYP3A5*1/*1) and one CYP3A5 non-expresser (orange line: individual with CYP3A5*3/*3), deviating from their corresponding typical CYP3A5 profiles by IIV. The colored dots represent the measured concentrations for each patient, deviating from the underlying “true” patient concentration (line) by RUV. CYP cytochrome P450, IIV inter-individual variability, RUV residual unexplained variability

Therapeutic drug monitoring (TDM) entails the measurement of drug concentrations at specific time intervals to guide dose adjustments. It is important to note that TDM is considered a retrospective approach, as the measurements can only be conducted after drug administration, and is therefore seen as an a posteriori method for dose individualization [3]. Within the medical disciplines of oncology, infectious diseases, cardiology, and psychiatry, explicit recommendations for TDM have been made [2], and both national and international working groups have been established to provide comprehensive guidelines for clinical application [2]. Nevertheless, the usefulness of TDM measurements can be compromised by significant inter-occasion variability (IOV) (i.e., random variability between different dosing occasions that is not related to inherent changes in any of the PK characteristics) and residual unexplained variability (RUV) (i.e., related to errors or variability in the handling of the sample, bioanalysis, model misspecification, or other intra-individual factors) (Fig. 1). Moreover, the economic cost, patient inconvenience associated with sample collection, and the requirement for assays with reasonable turnaround times can present practical limitations.

Despite extensive research exploring the clinical feasibility of pharmacogenetics and TDM, there has been limited focus on directly comparing these two approaches. Most clinical studies have primarily focused on evaluating one method independently, and there is a lack of head-to-head trials comparing their efficacy. One potential explanation for this gap is the anticipated small effect size and, consequently, the need for a large number of participants to achieve adequate statistical power [7]. However, the absence of direct comparisons can pose challenges in scenarios where both pharmacogenetic and TDM recommendations exist for a particular therapy. Implementing both pharmacogenetic testing and TDM could potentially offer a viable solution in such cases; however, practical constraints and resource scarcity may limit the feasibility of simultaneous implementation [8].

In silico PK studies offer a valuable means to evaluate exposure approximations for both pharmacogenetic profiling and TDM, bypassing practical, financial, and ethical constraints associated with real-life clinical trials [9]. This study aimed to quantitatively assess the circumstances in which pharmacogenetic profiling may outperform TDM in estimating drug exposure, taking into account three sources of variability: IIV, IOV, and RUV. TDM was considered in its commonly implemented form in clinical practice, involving measurements of Ctrough (or area under the curve [AUC] through non-compartmental analysis [NCA]) without the application of a population pharmacokinetic (popPK) model. Several therapeutic scenarios, where both pharmacogenetic testing and TDM-based dose individualization have been proposed, were examined to identify the conditions under which one approach provides more accurate exposure predictions than the other. Based on the findings, clinical recommendations are formulated for individualized dose adjustments in the investigated cases.

2 Methods

2.1 Literature Search and Model Selection

A literature search was performed in PubMed in March 2023 to retrieve PK models published after 2010 that could be employed for simulations of both pharmacogenetic and TDM-guided dosing. The search strategy included relevant single-nucleotide polymorphisms (SNPs) known to exert clinically significant effects on PK, as defined by the Dutch Pharmacogenetic Working Group and the Clinical Pharmacogenetics Implementation Consortium. The specific genes included in the search were ABCG2, CYP2B6, CYP2C9, CYP2C19, CYP2D6, CYP3A4, CYP3A5, DPD, MTHFR, NUDT15, SLCO1B1 (OATP1B1), TPMT, and UGT1A1.

The initial screening of publication titles and abstracts was performed using Rayyan QCRI by author NR, and an independent review of the results was conducted by author MC. The inclusion criteria consisted of (1) identification of PK models based on the search terms, (2) determination of whether SNPs relevant to drug exposure were incorporated as covariates within the model structure, and (3) selection of models for drugs used in TDM. In this context, TDM was defined as the direct measurement of drug concentrations without any model-based estimations, such as model-informed precision dosing (MIPD). Exclusion criteria for the identified publications included non-English language publications, physiology-based PK models, drugs not commonly used in Dutch or Swedish clinics, and models based on fewer than 30 patients.

In cases where multiple models for a specific drug were available, the model with the highest number of samples and patients, along with its potential for accurate predictions and clinical relevance, as assessed by both NR and MC, was chosen. All selected models were then applied in PK simulations to evaluate their concordance with reported observational PK values. If necessary, adjustments were made to the models to align them with the research focus.

2.2 Virtual Patients

The selected models and corresponding parameter estimates were coded in mrgsolve (version 0.11.2) in RStudio (version 4.1.1) [10,11,12]. For each drug case, a virtual population of 1000 patients was created. Pharmacogenetic phenotypes were randomly sampled according to the original patient distribution or, in case this was not available, from the Genome Aggregation Database (gnomAD) [13]. For simplicity, other covariates (e.g., weight and co-medication) were fixed to a single value: continuous variables were fixed at the median value and categorical variables at the mode. Because patients were compared to their own baseline data across simulations that only incorporated variations in IIV, IOV, and RUV, varying the covariates among patients was less critical in this analysis. The dosing regimen used in the simulations was the one specified in the original publication of the model. When the model was based on multiple dosing categories, the median dose was selected.

Simulations for each drug case were performed in two steps. During simulation step 1 (SIM1), a single model-based simulation was performed using the reported typical and variability parameters. Each PK parameter for individual i (\({\theta }_{i}\), e.g., clearance) is here represented by a fixed effect component (\({\theta }_{1}\), i.e., the typical value), multiplied by a random effect component representing the deviation of \({\theta }_{i}\) from \({\theta }_{1}\) via a log-normal distribution. However, rather than sampling the random effect component (\({\eta }_{i}\)) from a normal distribution with a mean of 0 and the reported standard deviation of ω, the distribution was changed to a mean of 0 and standard deviation of 1. Subsequently, \({\eta }_{i}\) was multiplied by \({\theta }_{1,\text{IIV}}\), set to the value of ω, to capture the original deviation of \({\theta }_{i}\) from the typical value \({\theta }_{1}\) (Eq. 1).

$$\theta_{i} = { }\theta_{1} { } \times e^{{\theta_{{1,{\text{IIV}}}} { } \times { }\eta_{i} }} .$$
(1)

Each PK parameter for individual i at time j (\({\theta }_{ij}\)) is then represented by \({\theta }_{i}\) multiplied by a random effect component, representing the deviation of \({\theta }_{ij}\) from \({\theta }_{i}\) via a log-normal distribution. However, rather than sampling the random effect component (\({\kappa }_{ij}\)) from a normal distribution with a mean of 0 and the reported standard deviation of π, the distribution was changed to a mean of 0 and standard deviation of 1. Subsequently, \({\kappa }_{ij}\) was multiplied by \({\theta }_{\text{IOV}}\), set to the value of π, to capture the original deviation of \({\theta }_{ij}\) from \({\theta }_{i}\) (Eq. 2).

$$\theta_{ij} = \theta_{i} \times e^{\theta_{\text{IOV}} \times \kappa_{ij}} .$$
(2)

The influence of a covariate value of individual i (\({\text{COV}}_{i}\)), such as the influence of pharmacogenetic CYP3A5 variations (\({\text{COV}}_{\text{CYP}3\text{A}5,i}\)) on drug clearance, was added as a relative deviation from \({\theta }_{ij}\) for each affected parameter unless otherwise specified (Eq. 3).

$$\theta_{ij} = { }\theta_{1} { } \times e^{{\theta_{{{\text{IIV}}}} { } \times { }\eta_{i} + \theta_{{\text{IOV }}} \times \kappa_{ij} }} \times {\text{COV}}_{i} .$$
(3)

The difference between the observed concentration (\({\text{Concentration}}_{ij}\)) and model-predicted individual concentration for individual i at time j (\({\text{IPRED}}_{ij}\)) is represented by a random effect component, either additively (additive error) or in relation to the magnitude of the \({\text{IPRED}}_{ij}\) (proportional error). However, the distribution of these random effects (\({\varepsilon }_{ij})\) was changed to a mean of 0 and standard deviation of 1 (instead of the reported standard deviation of σ). Following on, \({\varepsilon }_{ij}\) was multiplied by \({\theta }_{\text{RUV}}\) with the value of σ to capture the original difference between \({\text{Concentration}}_{ij}\) and \({\text{IPRED}}_{ij}\) (Eq. 4, additive error; Eq. 5, proportional error).

$${\text{Concentration}}_{ij} = {\text{ IPRED}}_{ij} { } + { }\theta_{{{\text{RUV}}1}} { } \times \varepsilon_{1ij}$$
(4)
$${\text{Concentration}}_{ij} = {\text{ IPRED}}_{ij} { } + {\text{ IPRED}}_{ij} { } \times { }\left( {\theta_{{{\text{RUV}}2}} { } \times \varepsilon_{2ij} } \right).$$
(5)

Because all random effect distributions were fixed to a mean of 0 and a standard deviation of 1, the simulated ηi, κij, and εij values represent the individual z-scores for each PK parameter (i.e., number of standard deviations from the mean). These individual z-scores were exported for simulation step 2 (SIM2), along with individual values of \({\theta }_{ij}\) and \({\text{COV}}_{\text{SIM}1i}\).
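The z-score construction in Eqs. 1–5 can be illustrated with a minimal Python sketch (the study itself used mrgsolve in R). The function names and all numerical values below are hypothetical and chosen only for illustration. The point is that a standard-normal draw multiplied by a scaling parameter has the same distribution as a draw with the reported standard deviation, while the stored z-score stays reusable under other variability magnitudes.

```python
import math
import random


def individual_parameter(theta_typ, theta_iiv, theta_iov, eta_z, kappa_z, cov=1.0):
    """Eq. 3: typical value times log-normal IIV and IOV terms and a covariate factor."""
    return theta_typ * math.exp(theta_iiv * eta_z + theta_iov * kappa_z) * cov


def observe_additive(ipred, theta_ruv1, eps1_z):
    """Eq. 4: additive residual error."""
    return ipred + theta_ruv1 * eps1_z


def observe_proportional(ipred, theta_ruv2, eps2_z):
    """Eq. 5: proportional residual error."""
    return ipred + ipred * (theta_ruv2 * eps2_z)


# Hypothetical illustration values (not taken from any of the selected models).
rng = random.Random(2023)
eta_z = rng.gauss(0.0, 1.0)    # z-score for IIV, one per individual
kappa_z = rng.gauss(0.0, 1.0)  # z-score for IOV, one per individual and occasion
eps_z = rng.gauss(0.0, 1.0)    # z-score for RUV, one per observation

# The same z-scores are reused with two different IIV magnitudes.
cl_small_iiv = individual_parameter(20.0, 0.1, 0.2, eta_z, kappa_z, cov=1.6)
cl_large_iiv = individual_parameter(20.0, 0.8, 0.2, eta_z, kappa_z, cov=1.6)
conc_obs = observe_proportional(5.0, 0.15, eps_z)
```

Because the z-scores are stored separately from the scaling parameters, the same virtual patient can be re-simulated under any combination of variability magnitudes.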

2.3 Simulation of Biomarker Effect

In SIM2, the magnitudes of IIV, IOV, and RUV were varied (i.e., by changing the \({\theta }_{\text{IIV}}\), \({\theta }_{\text{IOV}}\), and \({\theta }_{\text{RUV}1/2}\) values from Eqs. 1, 2, 3, 4, and 5, respectively) in each simulation. All other conditions remained the same, including the patient population, their individual z-scores, and the time point at which exposure was measured. This was done to evaluate their influence on the predicted drug exposure by the pharmacogenetic and TDM methods. IIV and IOV were only changed for the parameter of interest (e.g., clearance), while the original variability values were kept for all other parameters. The values of IIV, IOV, and RUV were varied in each simulation using different combinations of magnitudes, ranging from 0 to 1 in 0.1 increments for IIV, ranging from 0 to 0.5 in 0.1 increments for IOV, and ranging from 0 to 0.3 in 0.05 increments for RUV (proportional), unless otherwise specified. For the additive error model, a range between 0 and 0.5 of the average concentration achieved during steady state (Cavg,ss) was evaluated, in 0.05 increments. This resulted in 463 simulated scenarios (i.e., 6 × 7 × 11 + the originally reported IIV, IOV, and RUV values) for each of the original 1000 individuals, per drug case.
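As a check on the scenario count, the grid for the proportional error case can be enumerated in a few lines of Python. This is only a sketch: the tuple standing in for the originally reported values is a hypothetical placeholder, not a value from any of the selected models.

```python
from itertools import product

# Levels built from integers to avoid floating-point drift in the increments.
iiv_levels = [i / 10 for i in range(11)]  # 0 to 1 in 0.1 increments (11 levels)
iov_levels = [i / 10 for i in range(6)]   # 0 to 0.5 in 0.1 increments (6 levels)
ruv_levels = [i / 20 for i in range(7)]   # 0 to 0.3 in 0.05 increments (7 levels)

grid = list(product(iiv_levels, iov_levels, ruv_levels))

# Hypothetical placeholder for the originally reported (IIV, IOV, RUV) values.
reported = (0.45, 0.25, 0.12)
scenarios = grid + [reported]

print(len(grid), len(scenarios))
```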

For the parameter of interest (e.g., clearance), the influence of the covariate of interest (e.g., CYP450 genotype/phenotype) was scaled to the original value (\({\text{COV}}_{\text{SIM}1i}\) from SIM1). This was done to ensure that the individual parameter value remained identical, whilst only the relative contribution of the random effect components (\({\theta }_{\text{IIV}} \times {\eta }_{i }{ + \theta }_{\text{IOV }}\times {\kappa }_{ij}\)) and the fixed effect component (i.e., \({\text{COV}}_{\text{SIM}2i}\)) differed between the subsequent simulation rounds. As such, the total variability between patients remained consistent throughout the simulations and only the relative contribution of unexplained IIV (through θIIV × ηi) and explained IIV (through the COVi) changed. The covariate factor for each simulation COVSIM2 was calculated as follows (Eq. 6):

$$\text{COV}_{\text{SIM}2i} = \frac{e^{\left(\theta_{\text{IIV},\text{SIM}1} \times \eta_{i} + \theta_{\text{IOV},\text{SIM}1} \times \kappa_{ij}\right)}}{e^{\left(\theta_{\text{IIV},\text{SIM}2} \times \eta_{i} + \theta_{\text{IOV},\text{SIM}2} \times \kappa_{ij}\right)}} \times \text{COV}_{\text{SIM}1i} .$$
(6)
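The effect of Eq. 6 can be verified numerically: after rescaling the covariate factor, the individual parameter value from Eq. 3 is unchanged even though the variability magnitudes differ. The sketch below uses hypothetical values for the typical clearance, the z-scores, and the variability magnitudes.

```python
import math


def cov_sim2(cov_sim1, eta_z, kappa_z, iiv_sim1, iov_sim1, iiv_sim2, iov_sim2):
    """Eq. 6: covariate factor that offsets the change in random-effect magnitudes."""
    numerator = math.exp(iiv_sim1 * eta_z + iov_sim1 * kappa_z)
    denominator = math.exp(iiv_sim2 * eta_z + iov_sim2 * kappa_z)
    return numerator / denominator * cov_sim1


def parameter(theta_typ, iiv, iov, eta_z, kappa_z, cov):
    """Eq. 3: individual parameter value."""
    return theta_typ * math.exp(iiv * eta_z + iov * kappa_z) * cov


# Hypothetical values for one virtual patient.
theta_typ, cov1 = 20.0, 1.6
eta_z, kappa_z = 0.8, -0.3
iiv1, iov1 = 0.4, 0.2  # SIM1 magnitudes
iiv2, iov2 = 0.1, 0.0  # SIM2 magnitudes

cov2 = cov_sim2(cov1, eta_z, kappa_z, iiv1, iov1, iiv2, iov2)
value_sim1 = parameter(theta_typ, iiv1, iov1, eta_z, kappa_z, cov1)
value_sim2 = parameter(theta_typ, iiv2, iov2, eta_z, kappa_z, cov2)
```

Here `value_sim1` and `value_sim2` agree up to floating-point rounding, so only the split between explained and unexplained variability changes between simulation rounds.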

For each simulated individual (n = 1000) and under each simulation scenario (n = 463), the three different values (i–iii) of the relevant summary exposures (Ctrough or AUC, depending on the drug) were exported (Fig. 1):

(i) The true value, including the impact of all covariates (i.e., explained IIV) and unexplained IIV (Eqs. 1 and 3), but without the effect of IOV and RUV (i.e., \({\theta }_{\text{IOV}}\text{ and }{\theta }_{\text{RUV}}\) = 0)

(ii) The pharmacogenetic predicted value, including the impact of all covariates (i.e., explained IIV) (Eq. 3), but without the effect of unexplained IIV, IOV, and RUV (i.e., \({\theta }_{\text{IIV}}, {\theta }_{\text{IOV}},\text{ and }{\theta }_{\text{RUV}}\) = 0)

(iii) The TDM predicted value, including the impact of all covariates (i.e., explained IIV), unexplained IIV, IOV, and RUV (Eqs. 1, 2, 3, 4, and 5).

Ctrough was here represented as a single trough concentration at steady state. For TDM of the AUC, the exposure was calculated after a single dose by the trapezoidal method (NCA) using three optimized time points at a single dose interval (i.e., a limited sampling approach) as earlier proposed [14, 15]. IOV was excluded from the (i) true value as it was considered a random value that cannot be predicted for an individual from one occasion to another, consistent with previous research [16].
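To illustrate why a three-point trapezoidal AUC can deviate from the true exposure, the following sketch applies the linear trapezoidal rule to a hypothetical one-compartment intravenous bolus model. The model, parameter values, and sampling times are assumptions for illustration only and are not the optimized time points used in the study. The sketch integrates only between the first and last sample and compares against the analytical AUC over that same interval.

```python
import math


def conc_iv_bolus(t, dose, cl, v):
    """Concentration at time t for a one-compartment IV bolus model."""
    return dose / v * math.exp(-cl / v * t)


def auc_trapezoid(times, concs):
    """Linear trapezoidal AUC between the first and last sampling time."""
    auc = 0.0
    for k in range(1, len(times)):
        auc += (times[k] - times[k - 1]) * (concs[k] + concs[k - 1]) / 2.0
    return auc


# Hypothetical dose (mg), clearance (L/h), volume (L), and sampling times (h).
dose, cl, v = 100.0, 10.0, 50.0
times = [1.0, 4.0, 12.0]
concs = [conc_iv_bolus(t, dose, cl, v) for t in times]

auc_nca = auc_trapezoid(times, concs)

# Analytical AUC over the same interval for comparison.
k_el = cl / v
auc_true = dose / v / k_el * (math.exp(-k_el * times[0]) - math.exp(-k_el * times[-1]))
accuracy_nca = auc_nca / auc_true
```

For a mono-exponential decline the linear trapezoidal rule overestimates the area between sparse samples, so `accuracy_nca` exceeds 1 in this example even without any IOV or RUV.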

2.4 Graphical and Numerical Evaluation

Summary exposures, as predicted by pharmacogenetic and concentration measurements, were compared to the “true” PK value (Eq. 7).

$${\text{Accuracy }} = {\text{ Exposure}}_{{{\text{predicted}}}} /{\text{Exposure}}_{{{\text{true}}}} .$$
(7)

For each individual, the approach (pharmacogenetic vs. TDM) that predicted an accuracy closest to 1, meaning that the predicted exposure is closest to the true exposure, was considered superior. This study did not explore dose adjustments based on the TDM and pharmacogenetic testing results. The percentage of the 1000 patients for whom the pharmacogenetic biomarker performed better than TDM was computed for all simulated scenarios. The final results were visualized in 3D plots using R package plot3D (version 1.4).
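The per-patient comparison can be sketched as follows. The exposure values are hypothetical, and two details are assumptions of this sketch rather than statements from the study: closeness to 1 is measured as the absolute difference |accuracy − 1|, and ties are not counted as a win for the pharmacogenetic approach.

```python
def accuracy(predicted, true):
    """Eq. 7: ratio of predicted to true exposure."""
    return predicted / true


def pgx_win_percentage(true_vals, pgx_vals, tdm_vals):
    """Percentage of patients for whom the pharmacogenetic prediction is closer to truth."""
    wins = 0
    for true, pgx, tdm in zip(true_vals, pgx_vals, tdm_vals):
        # Assumption: distance from 1 on the linear scale; ties go to TDM.
        if abs(accuracy(pgx, true) - 1.0) < abs(accuracy(tdm, true) - 1.0):
            wins += 1
    return 100.0 * wins / len(true_vals)


# Hypothetical exposures for four virtual patients.
true_exposure = [10.0, 12.0, 8.0, 15.0]
pgx_predicted = [11.0, 9.0, 8.5, 15.0]
tdm_predicted = [9.5, 12.5, 10.0, 12.0]

pct_pgx_better = pgx_win_percentage(true_exposure, pgx_predicted, tdm_predicted)
```

In this toy example the pharmacogenetic prediction is closer for the third and fourth patient, so `pct_pgx_better` is 50.0.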

2.5 Comparing NCA and MIPD for AUC

To address concerns associated with the calculation of AUC using NCA [17], the potential for improved accuracy offered by MIPD was assessed for drugs in which AUC had been suggested to be the relevant exposure metric (specifically, vincristine and 5-FU). For MIPD, the same three concentration measurements utilized in the NCA were inputted into the corresponding PK models to estimate individual AUC values in NONMEM (version 7.4). The AUC values obtained through both NCA and MIPD were compared to the true values in order to evaluate their accuracy:

$${\text{Accuracy}}_{{{\text{MIPD}},{ }i/{\text{NCA}}, i}} = \frac{{{\text{AUC}}_{{{\text{MIPD}}, i/{\text{NCA}},{ }i}} }}{{{\text{AUC}}_{{{\text{true}},i}} }}.$$
(8)

3 Results

3.1 Literature Search

A total of 586 articles were identified, of which 40 were found to adhere to the predefined selection criteria based on title and abstract. Of the remainder, 35 model articles were excluded due to the availability of an alternative model for the same drug that better fit the criteria. In the end, six articles, each corresponding to one drug (tacrolimus, tamoxifen, efavirenz, 5-FU, risperidone, and vincristine), were selected for the simulations.

Tacrolimus: A two-compartment tacrolimus model by Andrews et al. was selected [18], which was built on rich sampling in a relatively large patient population (i.e., five to ten samples per patient and occasion, 4527 samples from 337 patients) (Table 1). Compared to the other models, this model estimated all parameters, including separate influence of CYP3A5 and CYP3A4 variations on the PK. Furthermore, the model incorporated an estimation of IOV on clearance. The model accounts for the 1.6-fold increase in tacrolimus clearance for CYP3A5 expressers (CYP3A5*1/*1 or CYP3A5*1/*3), as compared to CYP3A5 non-expressers (CYP3A5*3/*3). This is in line with the guideline recommendation advocating the administration of 1.5–2.5 times the standard dose for individuals expressing CYP3A5 [19].

Table 1 Overview of included pharmacokinetic models

Tamoxifen: A PK model structure by Klopp-Schulze et al. was selected describing tamoxifen by a one-compartment model, linked to a second one-compartment model representing the active metabolite endoxifen [20] (Table 1). The model structure accounted for differences in endoxifen formation between patients with different CYP2D6 activity scores (AS), with a change of tamoxifen clearance into endoxifen of −72.2% (AS = 0), −51.0% (AS = 0.5), −21.1% (AS = 1–1.5), and +53.3% (AS ≥ 2.5) relative to the reference value of AS = 2. For patients with AS 0–0.5, this corresponded to a reduction of 48.9–70.5% in the fraction of tamoxifen metabolized into endoxifen [20], which coincides with the guideline recommendations to increase the tamoxifen dose by twofold in these patients [19]. Simulations were conducted to assess drug concentration concerning endoxifen exposure, considering its 200-fold increased potency [20].

Efavirenz: The PK model structure by Habtewold et al. was selected describing efavirenz by a two-compartment model, linked to a second two-compartment model representing the metabolite 8-OH-efavirenz (Table 1) [21]. Compared to the other available models, the model by Habtewold et al. estimated all parameters based on the dataset and included samples at steady state. The model accounted for differences in clearance between the CYP2B6 alleles, with a +109% (CYP2B6*1/*1) and +63% (CYP2B6*1/*6) increased efavirenz clearance relative to the reference CYP2B6*6/*6 genotype. This aligns with the guidelines suggesting a reduction in the efavirenz dose to 0.33–0.66 of the standard dose for patients with the CYP2B6*6/*6 genotype [19]. Simulations of drug concentration were made for efavirenz exposure only, as the metabolite 8-OH-efavirenz has limited activity.
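The correspondence between the reported clearance differences and the guideline dose range can be made explicit with a short calculation. This sketch assumes linear PK with AUC = F × dose / CL and equal bioavailability across genotypes, and it assumes that the standard dose refers to carriers of at least one CYP2B6*1 allele; none of these assumptions is stated in this form in the source.

```python
def relative_dose_for_equal_auc(cl_fold_change):
    """Dose for the reference genotype, as a fraction of the comparator's dose,
    that yields the same AUC when the comparator's clearance is
    cl_fold_change times the reference clearance (assumes AUC = F * dose / CL)."""
    return 1.0 / cl_fold_change


# Clearance relative to CYP2B6*6/*6, from the reported +109% and +63% increases.
dose_vs_1_1 = relative_dose_for_equal_auc(1.0 + 1.09)  # versus CYP2B6*1/*1
dose_vs_1_6 = relative_dose_for_equal_auc(1.0 + 0.63)  # versus CYP2B6*1/*6

print(round(dose_vs_1_1, 2), round(dose_vs_1_6, 2))
```

Under these assumptions the exposure-matching dose fractions are roughly 0.48 and 0.61, both within the 0.33–0.66 range cited from the guidelines.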

Risperidone: A PK model structure by Vandenberghe et al. was selected describing risperidone by a one-compartment model, linked to a second one-compartment model representing the active metabolite paliperidone [22]. Compared to the other PK models, the model by Vandenberghe et al. was estimated on concentrations after first dose and at steady state, in addition to assessing multiple relevant CYP2D6 variants. The model accounted for the differences in CYP2D6 metabolism of risperidone to paliperidone, through a parameter representing the fraction of risperidone dose converted to paliperidone from the depot. The fraction of dose converted into paliperidone by first-pass effect was 0.93 (CYP2D6 extensive and ultra-rapid metabolizers), 0.85 (CYP2D6 intermediate metabolizers), and 0.07 (CYP2D6 poor metabolizers). This resulted in a 37% higher total AUC for the CYP2D6 poor metabolizers, which to a certain extent corresponds to the recommended 50–67% risperidone dose reduction for these patients [19]. Simulations of drug concentration were made for both risperidone and paliperidone, as they share similar potency. The influence of CYP2D6 genotype on the fraction for individual i (\({\text{FR}}_{i}\)) converted into paliperidone was calculated through an expit function (Eq. 9). Additionally, to account for the reduced influence due to the expit function, the values of IIV and IOV were varied in each simulation using different combinations of magnitudes, ranging from 0 to 15 in 1.5 increments.

$$\text{FR}_{i} = \frac{1}{1 + e^{\left(2.6 + \text{COV}_{\text{CYP}2\text{D}6} + \theta_{\text{IIV}} \times \eta_{i} + \theta_{\text{IOV}} \times \kappa_{ij}\right)}}$$
(9)
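A small numerical sketch of Eq. 9, exactly as written above, shows how the logistic form bounds the fraction between 0 and 1 and dampens the effect of the random-effect terms. The covariate shift and variability values used below are hypothetical; the genotype-specific covariate values of the original model are not reproduced here.

```python
import math


def fraction_converted(cov_cyp2d6=0.0, theta_iiv=0.0, eta_z=0.0, theta_iov=0.0, kappa_z=0.0):
    """Eq. 9 as written: logistic transform of the linear predictor."""
    linear_predictor = 2.6 + cov_cyp2d6 + theta_iiv * eta_z + theta_iov * kappa_z
    return 1.0 / (1.0 + math.exp(linear_predictor))


# With the covariate and random-effect terms set to zero.
fr_baseline = fraction_converted()

# Hypothetical covariate shift, to show how the fraction moves toward the other extreme.
fr_shifted = fraction_converted(cov_cyp2d6=-5.2)

# Hypothetical IIV magnitude of 1.5 with a z-score of +1: the result stays within (0, 1).
fr_with_iiv = fraction_converted(theta_iiv=1.5, eta_z=1.0)
```

With all additional terms at zero, the expression evaluates to approximately 0.069. Because the transform flattens near 0 and 1, comparatively large IIV and IOV magnitudes are needed to move the fraction appreciably, which is consistent with the wider range explored for risperidone.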

5-Fluorouracil: The two-compartmental 5-FU model by van Kuilenburg et al. was selected as it accounted for the non-linear PK and related the DPD genotype (DPYD) to reduced clearance in carriers of the inactivating c.1905+1G>A mutation [23] (Table 1). Several adjustments were made to the model:

(1) Apart from the maximum rate of elimination (Vmax) values for wild-type and heterozygous patients in the model, a third Vmax value was introduced to account for the severely reduced clearance in the subpopulation with homozygosity for the c.1905+1G>A mutation. By considering the known DPD enzyme activity levels for both the normal subpopulation and the severely deficient subpopulation, along with the Vmax value for the normal subpopulation, the Vmax for the subpopulation with severely reduced DPD enzyme activity was derived. The coefficient of variation (CV%) for the two subpopulations was assumed to be the same, based on the premise that similar sources of variability influencing Vmax in individuals with normal and deficient DPD enzyme activity would equally affect those with severely reduced DPD enzyme activity.

(2) To ensure uniformity of typical PK parameters among all patients, except for Vmax, each typical value was computed using the following methodology:

$${\text{PK estimate }} = \frac{{\left( {{\text{PK}}_{{{\text{control}}}} {*}N_{{{\text{control}}}} } \right) + \left( {{\text{PK}}_{{\text{heterozygous }}} {*}N_{{\text{heterozygous }}} } \right)}}{{N_{{{\text{total}}}} }},$$
(10)

where \({\text{PK}}_{\text{control}}\) is the PK parameter estimate of the control group, \({N}_{\text{control}}\) is the number of individuals in the control group, \({\text{PK}}_{\text{heterozygous}}\) is the PK parameter estimate of the heterozygous group, \({N}_{\text{heterozygous}}\) is the number of individuals in the heterozygous (DPD-deficient) group, and \({N}_{\text{total}}\) is the total number of individuals included in the study.
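Eq. 10 is a sample-size-weighted mean of the two group estimates. The sketch below uses hypothetical parameter estimates and group sizes and assumes that the total number of individuals equals the sum of the two groups.

```python
def pooled_pk_estimate(pk_control, n_control, pk_heterozygous, n_heterozygous):
    """Eq. 10: weighted mean of the group-specific PK parameter estimates."""
    n_total = n_control + n_heterozygous  # assumption: only these two groups contribute
    return (pk_control * n_control + pk_heterozygous * n_heterozygous) / n_total


# Hypothetical estimates and group sizes, for illustration only.
pooled = pooled_pk_estimate(pk_control=30.0, n_control=40,
                            pk_heterozygous=20.0, n_heterozygous=10)
print(pooled)
```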

Vincristine: A two-compartmental vincristine model by Centanni et al. was selected (Table 1) [24]. This model accounted for the 87% increased clearance of vincristine amongst the CYP3A5 high-expresser genotype versus the low-expresser genotype [25].

3.2 Pharmacogenetics Versus TDM

The results of the simulations for each drug case are visualized in Fig. 2. A consistent effect of increasing IIV, IOV, and RUV was observed across drug cases: an increase in unexplained IIV shifted the results in favor of TDM, whereas an increase in IOV or RUV favored pharmacogenetic testing. For some drug cases, however, the performance of pharmacogenetic testing relative to TDM remained largely unchanged regardless of the changes in variability. For instance, for efavirenz, the percentage of simulated patients for whom pharmacogenetic testing was favored above TDM remained around 50%. Similarly, for 5-FU, the percentage of patients for whom pharmacogenetic testing would be favored above TDM remained above 75% regardless of the variability. For risperidone, the influence of IIV and IOV appeared to be minor (represented by the flat horizontal surface of 50% [green]) as compared to changes in RUV, for which large differences in the percentage of patients benefiting from pharmacogenetic testing were seen.

Fig. 2

Percentage of patients benefiting from pharmacogenetic testing over TDM across different levels of IIV, IOV, and RUV. Each axis represents one source of variability, IIV (x-axis), IOV (z-axis), or RUV (y-axis), used in the simulations. Each dot represents the result of one simulation. The color of the dots indicates the percentage of patients for which the pharmacogenetic biomarker performed better than TDM in predicting exposure (Ctrough or AUC), displayed as 25 ± 2.5% (blue), 50 ± 2.5% (green), and 75 ± 2.5% (orange) of 1000 simulated patients. The black diamonds represent the combination of true values (i.e., reported IIV, IOV, and RUV) from the original model. AUC area under the curve, Ctrough trough concentration, IIV inter-individual variability, IOV inter-occasion variability, RUV residual unexplained variability, TDM therapeutic drug monitoring

In the case of tacrolimus, tamoxifen, risperidone, and vincristine, wider ranges of percentages for which pharmacogenetic testing was favored over TDM were seen across the evaluated IIV, IOV, and RUV (Fig. 2). For tacrolimus, tamoxifen, and vincristine, with IIV values above 0.4, the majority of simulations indicated a preference of 25% (blue data points) or less for pharmacogenetic testing, regardless of the IOV and RUV. Conversely, for IIV below 0.2 in the tamoxifen and vincristine simulations, the majority of simulations indicated a preference of 75% (orange data points) or more for pharmacogenetic testing. Between the IIV values 0.2 and 0.4, the percentage of patients simulated for which pharmacogenetic testing was favored above TDM additionally depended on the degree of IOV and RUV.

When the reported population variability values of each drug case were used (summarized in Table 2), a TDM approach appeared preferable for tacrolimus, with a higher accuracy in 89.0% of patients, and for tamoxifen, with a higher accuracy in 87.3% of patients. Conversely, pharmacogenetic testing was preferable for 5-FU, as it demonstrated a higher accuracy in 100% of patients (Table 2). For efavirenz, risperidone, and vincristine, there was no clear preference for pharmacogenetic testing or TDM, as the methods appeared to predict exposure equally well.

Table 2 Comparison of individuals benefiting more from pharmacogenetic testing compared to TDM across reported variability levels

3.3 Comparing NCA and MIPD for AUC

The model-based estimations in MIPD more accurately estimated the true AUC of 5-FU and vincristine, as compared to the NCA approach (Fig. 3). The figure shows reduced bias (smaller deviation from 1 for the median of the accuracy values) and increased precision (reduced variability of the accuracy values) for MIPD as compared to NCA for both drug cases. In the case of vincristine, the median accuracy (as defined by Eq. 8) was 0.93 for MIPD and 0.81 for NCA. The percentage of patients with an accuracy within the 0.8–1.25 range was 68.1% for MIPD and 34.2% for NCA. In the case of 5-FU, the median accuracy (as defined by Eq. 8) was 1.0 for MIPD and 0.88 for NCA. The percentage of patients with an accuracy within the 0.8–1.25 range was 81.0% for MIPD and 44.7% for NCA.
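The two summary statistics reported here, the median accuracy and the percentage of patients within the 0.8–1.25 range, can be computed as in the following sketch. The accuracy values are hypothetical, and treating the range limits as inclusive is an assumption of this sketch.

```python
from statistics import median


def percent_within(accuracies, lower=0.8, upper=1.25):
    """Percentage of accuracy values inside [lower, upper] (limits assumed inclusive)."""
    inside = sum(1 for a in accuracies if lower <= a <= upper)
    return 100.0 * inside / len(accuracies)


# Hypothetical accuracy values (Eq. 8) for six virtual patients.
accuracies = [0.70, 0.85, 0.95, 1.00, 1.10, 1.30]

median_accuracy = median(accuracies)   # bias: deviation of the median from 1
pct_in_range = percent_within(accuracies)  # precision: share within 0.8 to 1.25
```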

Fig. 3

Accuracy of the MIPD versus the NCA approach. Box plots display the population's 25th and 75th percentiles at the ends of each box, with whiskers extending to the 2.5th and 97.5th percentiles. The horizontal continuous lines cutting through each box represent the median values, whereas the dashed lines represent the 0.8 and 1.25 accuracy cutoffs. 5-FU 5-fluorouracil, AUC area under the curve, MIPD model-informed precision dosing, NCA non-compartmental analysis, pred prediction

4 Discussion

In this study, we assessed the impact of various sources of variability in the selection between a pharmacogenetic dosing strategy and TDM, using six drug cases: tacrolimus, tamoxifen, efavirenz, vincristine, risperidone, and 5-FU. The selected cases represent drugs in clinical practice where both pharmacogenetic-guided and TDM-based dose adjustments have been proposed. Additionally, the included drugs are utilized in diverse therapeutic areas and exhibit divergent PK characteristics, which broadens the applicability of our findings. This approach ensures the relevance of our results to real-world settings and allows for a comparative analysis of the influence of different PK attributes on the choice between pharmacogenetic-guided dosing and TDM.

For the drug cases in which Ctrough was the target (i.e., efavirenz, tacrolimus, tamoxifen, and risperidone), differences in the predictive performance of the pharmacogenetic biomarker (11.0–49.2%) may be due to differences in the IOV and RUV. The percentage of patients for whom pharmacogenetic testing estimated exposure better than TDM was higher for drug cases with larger (proportional) RUV (Table 2). Another factor that determines whether pharmacogenetic testing is favored over TDM is the population prevalence of each pharmacogenetic subtype: even when a genotype has a large effect on clearance, its influence at the population level is small when its prevalence in the studied cohort is low. This may explain why efavirenz and risperidone demonstrated a relatively high preference for pharmacogenetic testing (49.2% and 48.1%, respectively) compared with the other drugs, as the prevalence of the two genotypes was relatively balanced (48.3% vs. 51.7% [efavirenz] and 33.9% vs. 66.1% [risperidone]). This is in contrast to tacrolimus and tamoxifen, for which one genotype dominates in the population (prevalence of 83.9% vs. 16.1% [tacrolimus] and 85.1% vs. 14.9% [tamoxifen], taking into account that several genotypes have similar effects on clearance). Lastly, the magnitude of the difference between genotypes is expected to be of relevance.

For the drug cases in which AUC was the target (i.e., 5-FU and vincristine), the accuracy of the pharmacogenetic biomarker was found to be higher than that of TDM, particularly for 5-FU. This outcome can be partly attributed to the fact that the AUC calculation relied on three optimized sampling times, which may be too sparse for accurate trapezoidal calculations, particularly for drugs with multiple PK phases. This is problematic for both vincristine, which exhibits rapid distribution (within 10 min), and 5-FU, which has a short half-life (10–20 min), making precise measurement of exposure challenging [23, 24]. Nonetheless, three samples are regarded as a relatively high number in the clinical context, as obtaining more blood samples is labor-intensive and difficult in routine practice [26]. When concentration measurements were employed in MIPD to estimate exposure, more accurate AUC predictions were obtained (Fig. 3), as suggested previously [27]. Furthermore, the pharmacogenetic biomarkers are often associated with clearance, which is inversely proportional to AUC in the case of linear elimination, so that a genotype effect on clearance translates directly into AUC. In contrast, Ctrough is influenced by several PK parameters, particularly because all examined drugs with Ctrough targets are administered orally, which introduces additional unexplained variability in absorption parameters. This variability can further reduce the accuracy of a pharmacogenetic approach for drugs where Ctrough is the target metric.
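The two arguments in the preceding paragraph can be made concrete with a small numerical sketch. It assumes a hypothetical one-compartment intravenous bolus model with a 15-min half-life and arbitrary sampling times (0, 30, and 60 min); this is not the population PK model or the optimized sampling design used in the study. The sketch compares a linear trapezoidal AUC from three samples with the exact AUC over the same interval, and shows that AUC equals dose divided by clearance under linear elimination.

```python
import math


def concentration(t_min, c0=1.0, half_life_min=15.0):
    """Mono-exponential decline after an IV bolus (hypothetical model)."""
    k = math.log(2) / half_life_min
    return c0 * math.exp(-k * t_min)


def linear_trapezoid(times, concs):
    """Linear trapezoidal AUC between the first and last sampling time."""
    return sum(
        (t2 - t1) * (c1 + c2) / 2.0
        for t1, t2, c1, c2 in zip(times, times[1:], concs, concs[1:])
    )


def exact_auc(t_start, t_end, c0=1.0, half_life_min=15.0):
    """Analytical AUC of the mono-exponential curve over [t_start, t_end]."""
    k = math.log(2) / half_life_min
    return c0 * (math.exp(-k * t_start) - math.exp(-k * t_end)) / k


def auc_inf(dose, clearance):
    """AUC from zero to infinity under linear elimination: AUC = dose / CL."""
    return dose / clearance


# Three samples versus one sample per minute over the same 60-min interval.
sparse_times = [0.0, 30.0, 60.0]
dense_times = [float(t) for t in range(0, 61)]

true_auc = exact_auc(0.0, 60.0)
sparse_ratio = linear_trapezoid(
    sparse_times, [concentration(t) for t in sparse_times]
) / true_auc
dense_ratio = linear_trapezoid(
    dense_times, [concentration(t) for t in dense_times]
) / true_auc

print(f"sparse / exact AUC: {sparse_ratio:.3f}")
print(f"dense / exact AUC:  {dense_ratio:.4f}")
print(f"AUC at CL=10: {auc_inf(100, 10)}, at CL=5: {auc_inf(100, 5)}")
```

Under these assumptions, the three-sample trapezoidal estimate overestimates the exact AUC by roughly 15%, whereas dense sampling is nearly exact. Halving clearance doubles AUC, which is why a biomarker that predicts clearance maps directly onto an AUC target.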

In terms of clinical implications, there are several considerations based on our findings. The clinical focus for 5-FU has been on pharmacogenetic biomarkers [28]. This is in line with our findings, since the rapid changes in the concentration–time profile are difficult to capture with sparse sampling. For tacrolimus and tamoxifen, our results align with the recommendations in favor of TDM [29, 30]. Although the Food and Drug Administration (FDA) is currently considering inclusion of CYP2D6 genotyping in the tamoxifen drug label, it has been argued that pharmacogenetic testing of CYP2D6 should be accompanied by TDM because drug exposure can vary over time [31] and with other patient characteristics [30]. Similarly, for tacrolimus, pharmacogenetic testing is considered in conjunction with TDM to determine the starting dose [32]. In the case of efavirenz, risperidone, and vincristine, no clear benefit of genotyping versus TDM was identified in our results. Indeed, a study found additional benefit of combining pharmacogenetic testing and TDM for dose adjustment of efavirenz, which could be attributed to the varying levels of benefit of each approach among individual patients [33]. For risperidone, neither pharmacogenetic-guided dosing nor TDM was clearly preferred, in line with prior research indicating potential benefits from both approaches [34]. For vincristine, research has focused on both pharmacogenetic testing [25, 35] and TDM [36]. The ongoing CHildren treated with vincristine: A trial regarding Pharmacokinetics, DNA and Toxicity of targeted therapy In pediatric oncology patients (CHAPATI) trial (ClinicalTrials.gov NCT05844670) seeks to investigate the value of TDM and genetic testing for vincristine treatment in children from Kenya. Regardless of the investigated drug, it is important that model selection is performed based on (1) the target population and (2) the biological plausibility of the parameter values.

There are a few limitations associated with this study. Firstly, the simulations depend on the suitability of the selected population PK models. For example, unaccounted comorbidities could be of relevance when translating to a real-world population. Additionally, for infrequently occurring genotypes that nevertheless exert a notable influence on PK, expressing the results as the percentage of patients who benefit may understate the impact of pharmacogenetic testing. Essentially, this approach favors what is optimal on average rather than catering to individual patient needs, which runs counter to the primary intent of biomarkers, namely to improve individual-level predictions. For example, although only an estimated 0.1% of the population exhibits complete DPD deficiency, which may appear negligible [37], overdosing these individuals can have severe, life-threatening consequences, making phenotyping/genotyping the favored approach for 5-FU. On a similar note, the study assumed that the genotypes in the selected PK models represented all relevant variants. In reality, additional genotypes may exist that exert a significant impact on PK. For example, our simulation only accounted for the effect of the DPYD*2A variant on 5-FU PK, while other variants such as c.2846A>T, c.1679T>G, and c.1236G>A, which collectively represent approximately 6% of the population [38], were not considered. Consequently, the proportion of patients (100%) for whom pharmacogenetic phenotyping was more accurate than TDM for 5-FU will be lower in scenarios where only limited genotype sequencing is available. While we chose models to represent the typical Caucasian patient population, it is important to note that the results may not generalize to special populations, such as pediatric patients or those taking co-medications. Models for such analyses should ideally be selected to accurately represent the target population [39]. Finally, this study evaluated the prediction of exposure based on a single occasion. However, TDM could be performed on multiple occasions to reduce the influence of IOV and RUV and provide values closer to the true exposure.
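The last point can be illustrated with a minimal sketch, under the simplifying assumptions that IOV and RUV are independent, log-normally distributed, and uncorrelated between occasions, and that repeated measurements are combined as a geometric mean. The variability magnitudes (20% each) and function names are hypothetical and are not taken from the models used in this study.

```python
import math


def log_sd_of_mean(omega_iov, sigma_ruv, n_occasions):
    """Log-scale SD of the geometric mean of n independent TDM measurements."""
    return math.sqrt((omega_iov**2 + sigma_ruv**2) / n_occasions)


def fraction_within(log_sd, lower=0.8, upper=1.25):
    """Probability that a log-normal accuracy ratio (median 1) lies in [lower, upper]."""

    def normal_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    return normal_cdf(math.log(upper) / log_sd) - normal_cdf(math.log(lower) / log_sd)


# Hypothetical variability: 20% IOV and 20% RUV (SDs on the log scale).
for n in (1, 2, 4):
    sd = log_sd_of_mean(0.2, 0.2, n)
    print(f"n={n}: log-scale SD={sd:.3f}, within 0.8-1.25: {100 * fraction_within(sd):.1f}%")
```

Under these assumptions, the log-scale standard deviation decreases with the square root of the number of occasions, so four occasions halve the combined influence of IOV and RUV relative to a single occasion.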

In conclusion, through model-based simulations, the circumstances in which pharmacogenetic profiling may outperform TDM in estimating drug exposure were evaluated based on three sources of variability (IIV, IOV, and RUV). As anticipated, a high unexplained IIV in combination with a low RUV and IOV skewed the results in favor of TDM, whereas a low unexplained IIV in combination with a high RUV favored pharmacogenetic testing. For drugs with low RUV and IOV on the parameter of interest (such as tamoxifen and tacrolimus), TDM provided a closer estimate of true exposure, whereas for drugs with low unexplained IIV and high RUV and IOV (such as 5-FU), pharmacogenetic profiling was the preferred method. However, additional factors can have an influence, such as the relative prevalence of each genotype, the type of PK parameter affected by the genotype, the magnitude of each genotype's effect on the PK parameter of interest, and the capacity to accurately measure exposure with TDM. This emphasizes the need for a case-by-case assessment of the benefits of TDM versus pharmacogenetic testing, for which the approach introduced in this study could be used. For instance, comparing thiopurine methyltransferase (TPMT) testing with TDM for thiopurines, or CYP2D6 testing with TDM for amitriptyline, could be valuable. Future efforts may explore combining pharmacogenetic biomarkers and TDM for more sophisticated dose adjustments [40–42]. However, integrating multiple biomarkers simultaneously may pose challenges in decision making, requiring methods like MIPD to consolidate diverse dosing recommendations, as proposed for vincristine [24], tamoxifen [43], and tacrolimus [44].