Key Points

Comparison and external validation of four published osimertinib population pharmacokinetic (popPK) models indicated differences between the popPK models and poor population predictive performance.

A novel popPK model with good predictive performance should be developed for patients with low osimertinib exposure.

1 Introduction

With approximately 2.2 million confirmed cases and 1.8 million deaths in 2020, lung cancer is the leading cause of cancer death worldwide [1]. The majority of these cases are classified as non-small cell lung cancer (NSCLC). The last two decades have seen remarkable improvements in the treatment of NSCLC, including novel targeted therapies [mainly tyrosine kinase inhibitors (TKIs)] for NSCLC with an actionable genomic alteration. The prevalence of epidermal growth factor receptor (EGFR) mutations ranges from 11% to over 49% depending on the patient population [2]. Third-generation EGFR TKIs such as osimertinib are currently the preferred first-line treatment option for metastatic EGFR-mutated NSCLC as well as for adjuvant treatment after complete resection of stage IB–IIIA EGFR-mutated NSCLC [3, 4].

The standard osimertinib dose is 80 mg once daily, irrespective of patient characteristics [5]. The dose can be reduced to 40 mg once daily in case of unacceptable toxicity. In the FLAURA phase III trial, 43% of patients treated with osimertinib needed a dose interruption, 5% a dose reduction, and 15% permanently discontinued treatment due to adverse events (AEs) [4]. This can potentially impact treatment efficacy. Despite the fixed-dose regimen, a large inter-individual variability has been reported for osimertinib trough concentrations at steady state (Cmin,SS) [6]. In the combined analysis of three osimertinib clinical trials, no clear exposure–response relationship was reported for efficacy, with very limited data available for Cmin,SS < 100 ng/ml [6]. Conversely, a linear exposure–toxicity relationship was reported [6] for some of the most prevalent AEs, including diarrhea and rash [4]. A retrospective study suggested that the risk of toxicity is increased at osimertinib Cmin,SS > 259 ng/ml [7]. Therefore, therapeutic drug monitoring (TDM) has been suggested [7] to allow for dose management in patients at risk, preventing avoidable toxicity without potentially compromising treatment efficacy.

Several population pharmacokinetic (popPK) studies have been reported which support Bayesian simulations for predicting plasma concentrations for osimertinib in individual patients [6,7,8,9]. It is currently unclear which model offers the best predictive performance and which models are most suitable for nonadherence management and model-informed dosing [10]. Therefore, an external validation of all popPK models available in the current literature has been performed. The validation of the osimertinib popPK models was carried out comparably to validations of popPK models for different drugs [11,12,13]. For this external evaluation, a data subset was used from the OSIBOOST clinical trial [14].

2 Methods

2.1 Literature Search

Available popPK models were identified using PubMed, which was accessed on 4 January 2024. Studies were only included if they presented pharmacokinetic models based on clinical pharmacokinetic data. Noncompartmental pharmacokinetic studies and animal studies were excluded. The search query is outlined in the supplementary file.

2.2 Data Collection

The data used for this study are a subset from the Dutch OSIBOOST clinical trial [14] (NCT03858491, registration date 2019-02-27; EudraCT number 2018-004290-28), which was reviewed and approved by an independent medical research ethics committee (METC19-013). The OSIBOOST study was conducted in accordance with Good Clinical Practice guidance. Informed consent was obtained from all participants included in the OSIBOOST study. The OSIBOOST trial was a pharmacokinetic proof-of-concept trial that included 11 patients who received regular treatment with osimertinib 80 mg once daily for EGFR-mutant NSCLC. Patients were included if they were clinically stable and had a relatively low osimertinib steady-state trough concentration (Cmin,SS ≤ 195 ng/ml) without relevant toxicity. Patients were excluded when irreplaceable CYP3A4/CYP3A5 inhibitors, inducers, or substrates were concurrently used during the study. As part of this trial, pharmacokinetic curves at steady state were first obtained for osimertinib monotherapy. Patients were then started on osimertinib/cobicistat concomitant therapy to cost-effectively boost osimertinib exposure, and, after at least 21 days, a second pharmacokinetic curve was collected. Samples were processed immediately upon collection and were prepared on dry ice to prevent osimertinib degradation. Osimertinib and AZ5104 concentrations were quantified using previously described validated multiplexed HPLC-MS/MS methods [15, 16].

For the evaluation of the published popPK models, only samples taken prior to the addition of cobicistat were included (i.e., the first pharmacokinetic curve in the OSIBOOST trial). A total of 88 datapoints from 11 patients were included: 44 osimertinib and 44 AZ5104 samples. When a continuous covariate was not known for an individual, the median value reported for the model under evaluation was used.
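The covariate handling described above can be illustrated with a minimal sketch. The covariate names and numerical values below are hypothetical placeholders and are not taken from the evaluated models or from the OSIBOOST dataset; the sketch only shows the idea of replacing an unknown continuous covariate with the median reported for the model under evaluation.

```python
def impute_missing_covariates(patient, model_medians):
    """Fill unknown continuous covariates with the model's reported medians.

    `patient` maps covariate names to values, with None marking an unknown
    value. `model_medians` holds the median values reported for the
    development population of the model being evaluated.
    """
    filled = dict(patient)  # copy so the caller's record is left untouched
    for name, median_value in model_medians.items():
        if filled.get(name) is None:
            filled[name] = median_value
    return filled


# Hypothetical covariates and values, for illustration only.
model_medians = {"weight_kg": 62.0, "albumin_g_per_l": 39.0}
patient = {"weight_kg": None, "albumin_g_per_l": 41.0}
completed = impute_missing_covariates(patient, model_medians)
```

Because each published model reports its own medians, the imputed value for the same patient can differ from model to model.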

2.3 External Model Evaluation

The identified literature popPK models were reconstructed in NONMEM version 7.4.4 and RStudio version 2023.09.01. The packages ggplot2, tidyvpc, and tidyverse were used for the diagnostic plots.

Population, individual, and simulation-based evaluations were performed. Population-based evaluation was carried out using goodness-of-fit (GoF) plots of the population-predicted concentrations (PRED) plotted against the observed concentrations (OBS); in addition, conditional weighted residuals (CWRES) were plotted against time after dose (TAD). Individual evaluations were based on GoF plots in which individual-predicted concentrations were plotted against the OBS. Simulation-based evaluation was carried out with a prediction-corrected visual predictive check (pcVPC).

Prediction errors (PE) were calculated using Eq. (1), and the absolute prediction errors (APE) were calculated using Eq. (2). The median prediction error (mPE) as a measure of bias was calculated and the median absolute prediction error (mAPE) was calculated as a measure of precision.

$$\mathrm{PE}_{i,j} = \frac{\mathrm{PRED}_{i,j} - \mathrm{OBS}_{i,j}}{(\mathrm{PRED}_{i,j} + \mathrm{OBS}_{i,j})/2} \times 100\% \tag{1}$$

$$\mathrm{APE}_{i,j} = \left|\frac{\mathrm{PRED}_{i,j} - \mathrm{OBS}_{i,j}}{(\mathrm{PRED}_{i,j} + \mathrm{OBS}_{i,j})/2}\right| \times 100\% \tag{2}$$

where i denotes the ith individual and j the jth concentration of that individual.
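As an illustration of Eqs. (1) and (2), the following minimal sketch computes the PE, APE, mPE, and mAPE for a handful of paired predictions and observations. The function names and the concentration values are hypothetical and serve only to show the arithmetic; this is not the analysis code used in this study.

```python
from statistics import median


def prediction_error(pred, obs):
    """Eq. (1): difference relative to the mean of PRED and OBS, in percent."""
    return (pred - obs) / ((pred + obs) / 2) * 100.0


def bias_and_precision(preds, observations):
    """Return (mPE, mAPE) for paired predictions and observations."""
    pe = [prediction_error(p, o) for p, o in zip(preds, observations)]
    ape = [abs(e) for e in pe]  # Eq. (2)
    return median(pe), median(ape)


# Hypothetical concentrations, for illustration only.
preds = [150.0, 200.0, 250.0]
observations = [100.0, 200.0, 150.0]
m_pe, m_ape = bias_and_precision(preds, observations)
```

Because the denominator is the mean of PRED and OBS, the error is symmetric: swapping a prediction and its observation changes only the sign of the PE. A positive mPE indicates overprediction.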

pcVPCs were developed where the OBS and the simulated concentrations were normalized based on the median population prediction [17]. For all the models, 1000 simulated datasets were generated based on the popPK model, and the concentrations of observations and simulations were plotted against the TAD.
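The normalization step can be sketched as follows, under the assumption that concentrations have a lower bound of zero and that each value within a time bin is scaled by the ratio of the bin's median population prediction to its own population prediction. The binning scheme and the example values are hypothetical, and the function is an illustrative sketch rather than the implementation used by the tidyvpc package.

```python
from statistics import median


def prediction_correct(values, preds):
    """Prediction-correct the concentrations of one time-after-dose bin.

    Each value (observed or simulated) is multiplied by
    median(PRED in bin) / PRED of the same record, assuming a lower
    bound of zero for concentrations.
    """
    if len(values) != len(preds):
        raise ValueError("values and preds must have the same length")
    bin_median = median(preds)
    return [v * bin_median / p for v, p in zip(values, preds)]


# Hypothetical observations and population predictions in one bin.
observed = [80.0, 120.0, 300.0]
population_pred = [100.0, 150.0, 400.0]
corrected = prediction_correct(observed, population_pred)
```

In this example the third record, whose population prediction is far above the bin median, is scaled down the most. The same correction would be applied to every simulated dataset before the percentiles are computed.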

3 Results

3.1 Selected Models for Evaluation

In the literature search, 289 articles were found. Of these, four studies [6,7,8,9] included popPK models and met the inclusion criteria (Table S1). All four models were developed with the goal of studying the exposure–response and/or the exposure–toxicity relationship of osimertinib and its active metabolites. The respective characteristics of the models are shown in Table 1. Apart from ethnicity, there were no notable differences between the baseline characteristics of the populations used to develop the models and those of our population, as set out in Table 2.

Table 1 Overview of selected population pharmacokinetic models
Table 2 Baseline characteristics of selected population pharmacokinetic models and OSIBOOST cohort

All final popPK models of the selected studies were one-compartment models. Brown et al. (Model 1) opted for a one-compartment model based on preclinical data [6]. Ishikawa et al. (Model 2) also selected a one-compartment model. Rodier et al. (Model 3) tested a two-compartment model, but it did not improve the objective function value (OFV) compared with a one-compartment model. Agema et al. (Model 4) reported that a two-compartment model did improve the OFV, but that it led to a poorer fit based on the Akaike information criterion and did not visually improve the model compared with a one-compartment model.

Osimertinib has two active major metabolites, AZ5104 and AZ7550 [18]. AZ5104 concentration in steady state was included in the final model of Model 1 by Brown et al. [6] and Model 2 by Ishikawa et al. [8]. Model 2 also included active metabolite AZ7550; however, since the OSIBOOST subset did not provide AZ7550 concentrations, the AZ7550 metabolite within the popPK of Model 2 could not be taken into account in this evaluation. Model 3 by Rodier et al. [9] and Model 4 by Agema et al. [7] did not take AZ5104 into account in their popPK models.

Model 1 by Brown et al. was developed with samples ranging from pre-dose to 504 h post-dose in 780 patients. Model 2 by Ishikawa et al. was developed based on 302 serum samples from 53 patients with sample times ranging from 2 to 23.6 h post-dose. Model 3 by Rodier et al. was developed based on 420 samples from 87 patients with sample times ranging from 0.33 to 27 h post-dose. Model 4 by Agema et al. was developed based on 819 samples, most of which were trough samples, from 159 patients.

3.2 Model Evaluation Osimertinib

In the GoF plots of population-predicted versus observed concentrations in Fig. 1, more concentrations lie above the line of identity for all four models. The distribution of predicted/observed concentrations of Models 1 and 4 follows the line of identity better than that of Models 2 and 3, which are almost horizontally distributed. However, the overall trend of Models 1 and 4 does not follow the line of identity. It must be noted that four distinctive concentrations can be seen in Fig. 1, which seem to be separated from the rest of the concentrations. These four measurements all stem from one patient who received an increased osimertinib dose of 160 mg once daily.

Fig. 1 Goodness-of-fit plot of population-predicted osimertinib concentrations plotted against observed osimertinib concentrations. µg/l micrograms per liter

Focusing on the individual-predicted GoF plots in Fig. 2, Model 1 shows the best distribution (closest to the line of identity) compared to Models 2–4. As can be expected, the concentrations in the individual-predicted GoF plots of osimertinib in Fig. 2 are more closely aligned with the line of identity than those in the population-predicted GoF plots of Fig. 1.

Fig. 2 Goodness-of-fit plot of individual-predicted osimertinib concentrations plotted against observed osimertinib concentrations. µg/l micrograms per liter

The CWRES of osimertinib plotted against TAD (Fig. 3) appear to be skewed for all models. More residuals are below the zero line for TADs of approximately 1 h and 24 h. The residuals of Models 1–3 from approximately TAD 2–10 h are evenly distributed around the zero line, whereas those of Model 4 lie more often above than below it. The CWRES of osimertinib plotted against population predictions in Fig. 4 are evenly distributed around the zero line for Models 1 and 4, when the four distinctive residuals around 400 µg/l are ignored. Again, these four measurements all stem from the one patient who received an increased osimertinib dose of 160 mg once daily. For Models 2 and 3, the CWRES of osimertinib versus PRED in Fig. 4 tend to lie above the zero line at higher population predictions.

Fig. 3 Conditional weighted residuals (CWRES) of osimertinib plotted against time after dose (h). CWRES conditional weighted residuals, h hour, µg/l micrograms per liter

Fig. 4 Conditional weighted residuals (CWRES) of osimertinib plotted against population predictions. h hour, µg/l micrograms per liter

The median PE and APE for each model are shown in Table S2. Model 4 has the lowest mPE of −0.876%, whereas Model 1 has the lowest mAPE of 7.71%. The boxplots of the PE and APE of the models are presented in Figs. S3 and S4, respectively.

The pcVPCs of all four models show a similar trend in Fig. 5, where no clear distinction between the models can be seen. All observed concentrations fall in the simulated shaded areas, but in the lower region of the simulated areas.

Fig. 5 pcVPC of osimertinib versus time after dose (h). The shaded areas represent the 80% CI of the simulated 5th, 50th, and 95th percentile osimertinib concentrations. The black solid and dotted lines represent the 5th, 50th, and 95th percentiles of the observed osimertinib concentrations. The individual points are the observed concentrations from the OSIBOOST cohort. CI confidence interval, pcVPC prediction-corrected visual predictive check

3.3 Model Evaluation Active Metabolite AZ5104

The population GoF plots of osimertinib’s active metabolite AZ5104 are shown in Fig. S5. In Models 1 and 2, more concentrations of AZ5104 are above the line of identity. The concentrations of Model 2 in Fig. S5 are distributed almost horizontally. As expected, the individual-predicted GoF plots of AZ5104 of Models 1 and 2 in Fig. S6 are closer to the line of identity than the population-predicted GoF plots in Fig. S5. The AZ5104 concentrations of Models 1 and 2 are comparably distributed along the line of identity, and no clear difference can be seen. The CWRES of AZ5104 for Models 1 and 2 in Fig. S7 appear to be skewed; more residuals are below the zero line for TADs of approximately 1 h and 24 h for both models. The residuals around TAD 6 h are evenly distributed for Model 2, while Model 1 has more residuals below the zero line. The residuals of both models around TAD 3 h are roughly evenly distributed around the zero line, although Model 2 has more residuals above it.

4 Discussion

Four popPK models [6,7,8,9] were evaluated with a subset of osimertinib plasma samples from the OSIBOOST clinical trial [14]. Focusing on the population predictions, all four popPK models overpredict the concentrations over the entire concentration range. When the datapoints of the individual who received 160 mg/day are excluded, all popPK models still overpredict, meaning that ignoring this patient in our analysis does not affect the conclusion. Furthermore, the CWRES are skewed, indicating that all models are misspecified for the OSIBOOST cohort. Model 1 was developed on the richest dataset and tested more covariates than Models 2–4. Model 1 also has the best predictive performance when all diagnostic plots are taken into account. However, neither Model 1 nor the other models are fit for use on a population basis in the OSIBOOST cohort. Nonetheless, all the models are useful for individual predictions in that cohort.

The poor population predictive performance of all four models for the OSIBOOST cohort may be due to several reasons. In general, a validation/evaluation population that differs from the popPK development populations is an important caveat, and can be an important factor for inconclusive results. More importantly, one of the selection criteria in the OSIBOOST clinical trial was that individuals should have an osimertinib Cmin,SS ≤ 195 ng/ml. This means that all individuals in the OSIBOOST cohort have limited exposure to osimertinib, despite receiving a standard dose (i.e., 80 mg/day) or, in one patient, even an increased osimertinib dose (i.e., 160 mg/day). The reasons for the relatively low osimertinib exposure in the OSIBOOST cohort are not conclusively reported. Osimertinib is predominantly metabolized by CYP3A4/CYP3A5 [5]. Therefore, the CYP3A4/CYP3A5 genotype could be a reason for the low osimertinib exposure in patients of the OSIBOOST trial. However, information about the CYP3A4/CYP3A5 genotype was available for 7 of the 11 patients in the OSIBOOST trial, all of whom were normal CYP3A4 metabolizers and CYP3A5 non-expressors [14]. Treatment adherence was also closely monitored [14]. Possible explanations for the low exposure could be a high CYP1A activity [19] or nonconcordance between the CYP3A genotype and phenotype. However, the clinical relevance of high CYP1A activity is not known. The low-exposure inclusion criterion introduces an important selection bias compared to the cohorts from the four evaluated models, which is reflected in the population predictions and pcVPC. All observed concentrations in the pcVPC are in the lower simulated areas, indicating that the OSIBOOST cohort is indeed a selection of individuals with a lower exposure to osimertinib compared to the cohorts from the four popPK models. The patients in the OSIBOOST trial were selected in a real-world setting, based on their low osimertinib exposure, so the poor population predictive performance of the evaluated popPK models using these data was partly foreseeable.

Model 2 was developed in a cohort of 100% East Asian patients, in contrast to the OSIBOOST cohort of 100% Caucasian patients. This difference probably contributes to the poor performance of Model 2 in the OSIBOOST cohort, since ethnicity was found to be a significant covariate on osimertinib clearance in Model 1.

All four selected popPK models are one-compartment models. Models 2, 3, and 4 had sampling times of up to 27 h after dose, which is less than osimertinib’s half-life (approximately 44 h) [5]. Therefore, the datasets of Models 2, 3, and 4 are arguably not rich enough to properly test if osimertinib is best described by a one- or two-compartment model. Model 1 has a rich dataset with sample times of up to 504 h after dose, and a one-compartment model was still seen as the best approach based on preclinical data.
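To give a rough sense of scale for the sampling-window argument, the following sketch computes the fraction of drug remaining after a given time under simple first-order (mono-exponential) elimination. This is a simplifying assumption for illustration only: it ignores absorption, accumulation at steady state, and any distribution phase. The half-life of 44 h and the sampling times of 27 h and 504 h are taken from the text; the function name is hypothetical.

```python
def fraction_remaining(time_h, half_life_h):
    """Fraction of drug remaining after time_h under first-order elimination."""
    return 0.5 ** (time_h / half_life_h)


HALF_LIFE_H = 44.0  # approximate osimertinib half-life cited in the text

# Latest sampling time for Models 2-4 and for Model 1, respectively.
remaining_27h = fraction_remaining(27.0, HALF_LIFE_H)
remaining_504h = fraction_remaining(504.0, HALF_LIFE_H)
```

Under these assumptions, roughly two thirds of the drug would still remain at 27 h, so only about one third of the decline is covered by the sampling window of Models 2, 3, and 4, whereas a window of 504 h spans more than 11 half-lives.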

For individual predictions, all four models can be used to predict osimertinib concentrations (and, for Models 1 and 2, also those of osimertinib’s active metabolite AZ5104) in patients with low osimertinib exposure. All four models can therefore be useful to guide TDM for osimertinib in these patients. When population predictions are required from the evaluated popPK models, all models have limitations in patients with low osimertinib exposure. Increasing the osimertinib exposure in patients with low osimertinib exposure can theoretically improve responses [14]. Hence, when overpredicting popPK models are used to guide optimal osimertinib exposure, patients are likely to receive overly modest dose increases, whereas underprediction would be less problematic in this setting. The overprediction of all popPK models is therefore a complicating factor when the models are used to guide increasing osimertinib exposure in patients with initially low exposure. For population-based predictions in patients with low osimertinib exposure, all models remain suboptimal.

5 Conclusion

All four popPK models can be used to individually predict osimertinib concentrations in patients with low osimertinib exposure. For population predictions, all four popPK models performed poorly in patients with low osimertinib exposure. A novel popPK model with good predictive performance should be developed for the OSIBOOST cohort. Ideally, the cause of the relatively low osimertinib exposure in our evaluation cohort should be known.