Ethical considerations
The WAPPS user agreement allows reuse of the data for modelling and other research purposes, as described in the WAPPS study protocols, which were approved by the HIREB at McMaster University and registered at ClinicalTrials.gov (NCT02061072, NCT03533504).
Data for model development
Data input into the WAPPS-Hemo platform by clinicians contains individual information relevant for modeling including, but not limited to, dose and duration of infusion; anthropometric data corresponding to body weight (BW), age and height (HT); endogenous (baseline) FVIII activity; measurement assay used (one-stage vs. chromogenic); timing and measured plasma FVIII activity of blood samples.
PK observations from hemophilia A patients receiving an infusion of Fanhdi® or Alphanate® were extracted from the WAPPS-Hemo database on February 16th, 2018. Patients with a history of inhibitors were included, but not those with current inhibitors. Only one occasion per patient was included in the dataset.
HT was not a mandatory covariate in previous versions of WAPPS and was missing for some patients. When HT was missing, it was imputed with the value predicted by a multilinear regression of HT on BW and age.
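The imputation step can be illustrated with a minimal sketch, assuming an ordinary least-squares fit of HT on an intercept, BW and age. The function name `impute_height` and the use of NumPy are assumptions made for illustration; the source does not state how the regression was implemented.

```python
import numpy as np

def impute_height(bw, age, ht):
    """Fill missing heights (NaN) with predictions from HT ~ 1 + BW + age."""
    bw, age, ht = (np.asarray(a, dtype=float) for a in (bw, age, ht))
    known = ~np.isnan(ht)
    # Design matrix: intercept, body weight, age.
    X = np.column_stack([np.ones_like(bw), bw, age])
    # Fit the regression on subjects with a recorded height only.
    coef, *_ = np.linalg.lstsq(X[known], ht[known], rcond=None)
    filled = ht.copy()
    filled[~known] = X[~known] @ coef
    return filled
```

Recorded heights are left untouched; only the missing entries are replaced by the regression prediction.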
PopPK model development
The PopPK analysis was performed using non-linear mixed effects modelling as implemented in NONMEM and PDxPop (v7.3 and v5.2, respectively; ICON Development Solutions, Ellicott City, MD, USA). Estimation of the parameters was performed using the Laplacian option implemented in NONMEM. Graphical analysis was conducted in MATLAB (R2017b, Mathworks, Natick, MA, USA).
As a first step, 1-, 2- and 3-compartment models were fitted to the observed PK data following an IV infusion, each incorporating any residual FVIII from a previous infusion (predose) and endogenous FVIII activity [9]. Equation 1 gives, as an example, the activity-time profile (C(t)) for a 2-compartment model.
$$C\left( t \right) = Ae^{ - \alpha t} + Be^{ - \beta t} + endogenous\;FVIII + (predose - endogenous)e^{ - \beta t}$$
(1)
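Equation 1 can be evaluated directly, as in the following minimal sketch. The function and argument names are hypothetical; `A`, `alpha`, `B` and `beta` stand for the macro-constants of the 2-compartment model as written in Eq. 1.

```python
import math

def fviii_activity(t, A, alpha, B, beta, endogenous, predose):
    """Eq. 1: bi-exponential decay plus endogenous and residual predose activity."""
    residual = predose - endogenous  # carry-over from a previous infusion
    return (A * math.exp(-alpha * t)
            + B * math.exp(-beta * t)
            + endogenous
            + residual * math.exp(-beta * t))
```

At t = 0 the expression reduces to A + B + predose, and at long times it tends to the endogenous level, since the residual term decays with the terminal rate.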
Endogenous FVIII was modeled as the value entered by care centers, or as 0.005 IU/mL when not provided. The latter applied to one patient only, in the evaluation dataset, for whom we imputed the most common lower limit of quantification (LLOQ) divided by 2; centers sometimes report LLOQ values as low as 0.004 IU/mL. Residual FVIII activity was calculated as the observed predose activity minus the endogenous level. This residual activity decayed at a rate equal to the terminal decay rate of the compartment model [20].
As the LLOQ is higher than the endogenous factor level in severe hemophilia A patients, samples below LLOQ (BLQ) can be observed. BLQ observations were considered as censored values and handled using the M3 method [21].
Variability in PK parameters (e.g. clearance, volume…) was described as between-subject variability (BSV) using an exponential function [9]. Error on the observations was modeled as residual unexplained variability (RUV) and was tested as additive, proportional and combined error [9].
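A minimal sketch of these two variability components is given below. The exponential BSV form follows the text; the combined error is written here as the square root of the summed additive and proportional variances, which is one common parameterization and an assumption on our part. All names are hypothetical.

```python
import math

def individual_parameter(typical_value, eta):
    """Exponential BSV: eta ~ N(0, omega^2) keeps the parameter positive."""
    return typical_value * math.exp(eta)

def residual_sd(pred, sigma_add=0.0, sigma_prop=0.0):
    """Residual SD for additive, proportional or combined error models."""
    return math.sqrt(sigma_add ** 2 + (sigma_prop * pred) ** 2)
```

Setting `sigma_prop=0` gives the additive model and `sigma_add=0` the proportional model, so the three tested error structures share one expression.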
As a second step, covariate analysis was performed. Covariate relationships were assessed graphically and explored by stepwise forward inclusion (dOFV > 3.84, p < 0.05) and backward elimination (dOFV > 6.63, p < 0.01) [22]. Body weight (BW), height (HT), age and fat-free mass (FFM) were explored as covariates and normalized to their population median values (\(cov_{med}\)) to perform the analysis. BW, HT and FFM were tested on each PK parameter (P) using the following equation for any subject i:
$$TVP_{i} = P_{pop} \left( {\frac{{cov_{i} }}{{cov_{med} }}} \right)^{\theta }$$
(2)
where \(TVP_{i}\) is the typical value of the PK parameter for subject i, \(cov_{i}\) is that subject's covariate value, \(P_{pop}\) is the typical value of the PK parameter for a subject with the median covariate value, and \(\theta\) is the exponent scaling the covariate effect on the PK parameter.
The age relationship was modeled with whichever of the linear (Eq. 3) or piecewise linear (Eq. 4) models was more significant. In the piecewise linear function, the breakpoint was fixed at the median age of the population, so that the typical value was constant for subjects younger than the median age and changed linearly with age for subjects older than the median age (increasing if \(\theta > 0\), decreasing if \(\theta < 0\)).
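As a small worked example of Eq. 2, the sketch below evaluates the power model for a single subject. The function name and the numerical values in the usage note are hypothetical and are not estimates from this analysis.

```python
def power_covariate(p_pop, cov, cov_med, theta):
    """Eq. 2: typical parameter value scaled by a median-normalized covariate."""
    return p_pop * (cov / cov_med) ** theta
```

For instance, with an exponent of 0.75, a subject whose covariate is twice the population median has a typical value 2^0.75 (about 1.68) times that of the median subject, while a subject at the median keeps the value `p_pop`.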
$$TVP_{i} = P_{pop} \left( {1 + \theta_{Age} (Age_{i} - Age_{med} )/Age_{med} } \right)$$
(3)
$$TVP_{i} = P_{pop} \left( {1 + \theta_{Age} max(0, Age_{i} - Age_{med} )/Age_{med} } \right)$$
(4)
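Equations 3 and 4 differ only in how ages below the median are treated, which the following sketch makes explicit. The function names are hypothetical.

```python
def age_linear(p_pop, age, age_med, theta_age):
    """Eq. 3: linear age effect centred on the median age."""
    return p_pop * (1.0 + theta_age * (age - age_med) / age_med)

def age_piecewise(p_pop, age, age_med, theta_age):
    """Eq. 4: no age effect below the median age, linear effect above it."""
    return p_pop * (1.0 + theta_age * max(0.0, age - age_med) / age_med)
```

Both functions return `p_pop` at the median age and agree above it; below the median, only the linear model departs from `p_pop`.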
If two covariates were correlated, only the most significant covariate was kept.
Selection between comparable intermediate models was primarily based on the objective function value (OFV) and the likelihood ratio test; adding one parameter to a model was considered a significant improvement if the OFV decreased by 3.84 or more, corresponding to p < 0.05 [9]. To complement model selection, diagnostic plots were used to assess the goodness of fit and the parameter distributions, in particular the shrinkage of these parameters [23]. If the shrinkage of any BSV parameter was higher than 35%, the model was considered over-parameterized and the BSV term was removed. Standard errors and confidence intervals of the parameters of the selected models were assessed by bootstrap analysis, performed on 1000 runs by random sampling with replacement, stratified by age.
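The two statistical ingredients of this paragraph can be sketched as follows: the p-value of the likelihood ratio test for one added parameter (chi-square distribution with 1 degree of freedom), and one stratified bootstrap resample. The function names and the stratum definition passed in as `stratum_of` are hypothetical; the age strata used in the analysis are not specified here.

```python
import math
import random
from collections import defaultdict

def lrt_p_value(delta_ofv):
    """Chi-square survival function with 1 df for a drop in OFV."""
    return math.erfc(math.sqrt(delta_ofv / 2.0))

def stratified_bootstrap(subjects, stratum_of, rng):
    """Resample subjects with replacement, keeping each stratum's size."""
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)
    sample = []
    for members in strata.values():
        sample.extend(rng.choices(members, k=len(members)))
    return sample
```

Evaluating `lrt_p_value` at 3.84 and 6.63 gives values close to 0.05 and 0.01, the thresholds used for forward inclusion and backward elimination. In a full bootstrap, the model would be re-estimated on each of the resampled datasets.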
PopPK model evaluation
The prediction-corrected visual predictive check (pcVPC) is a diagnostic tool that compares FVIII activity simulated by the model with the observations by plotting percentiles of the observations and of the simulations vs time [24]. Since the response profile depends on dose and covariates, the observations and simulations are normalized by the population predictions of the model, allowing a better evaluation of the model. The pcVPC was based on 500 simulated replicates.
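The normalization step can be sketched as below, assuming the usual form of prediction correction in which each value in a time bin is multiplied by the ratio of the bin's median population prediction to its own population prediction. The function name is hypothetical, and binning and percentile computation are omitted.

```python
from statistics import median

def prediction_correct(values, pop_preds):
    """Prediction-correct the observed or simulated values of one time bin."""
    bin_median = median(pop_preds)
    return [v * bin_median / p for v, p in zip(values, pop_preds)]
```

The same correction is applied to observations and to each simulated replicate, so that differences in dose and covariates within a bin no longer inflate the spread of the percentiles.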
Tenfold cross-validation was performed to evaluate the ability of the model to predict new data by splitting the data into a learning dataset, used for re-estimating the parameters of the model, and a validation dataset, used for evaluating the Bayesian predictions of the model. The evaluation consisted of calculating the relative error (\(RE_{i}\)) of each individual prediction of the newly estimated model (\(Pred_{i}\), derived from the sub-dataset) with respect to the prediction obtained using the original model (\(Pred_{i}^{0}\), derived from the complete dataset). For every subject i in the evaluation dataset: \(RE_{i} = 100\frac{{\left( {Pred_{i} - Pred_{i}^{0} } \right)}}{{Pred_{i}^{0} }}\). The evaluation was repeated 100 times using a random split of the dataset at every iteration. The median and 95th percentile of the absolute value of the relative errors were then computed for clearance (CL) and central volume (V1), as well as for half-life and time spent above a 0.02 IU/mL threshold (TAT2), which were individually derived from the predicted PK parameters. TAT2 was derived by simulating the PK profile using the individual PK parameters, baseline and dose information, then determining the time points at which FVIII activity was higher than 0.02 IU/mL.
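The error metrics and the TAT2 derivation can be sketched as follows. The function names are hypothetical; the 95th percentile uses a nearest-rank definition and TAT2 is approximated on a regular time grid, both of which are assumptions, since the source does not specify these details.

```python
import math
from statistics import median

def relative_error(pred, pred_ref):
    """RE_i in percent, relative to the original-model prediction."""
    return 100.0 * (pred - pred_ref) / pred_ref

def summarize_abs_errors(errors):
    """Median and nearest-rank 95th percentile of absolute relative errors."""
    ordered = sorted(abs(e) for e in errors)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return median(ordered), ordered[rank - 1]

def time_above_threshold(profile, t_end, threshold=0.02, dt=0.01):
    """Hours in [0, t_end) during which profile(t) exceeds the threshold."""
    n_steps = int(round(t_end / dt))
    return dt * sum(1 for k in range(n_steps) if profile(k * dt) > threshold)
```

Here `profile` is any callable returning FVIII activity (IU/mL) at time t (h), for example a function built from an individual's PK parameters, baseline and dose.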
Limited sampling analysis (LSA) evaluates the precision and bias of the model as a function of the number and timing of observations. More specifically, LSA assesses the model robustness in a sparse sampling environment and was performed as described in Brekkan et al. [16]. A virtual dataset was created using the same distribution of demographics and PK as in the final PopPK model. FVIII activity in 1000 virtual subjects receiving 50 IU/kg every Monday-Wednesday-Friday was simulated over 4 weeks. FVIII activity following the last Friday dose was used for the analysis. One sample was taken 30 min before and 9 samples were taken after that infusion (at 1, 3, 6, 12, 24, 30, 48, 54, and 72 h). Bayesian predictions of CL and V1, and the derived half-life and TAT2, obtained from sparse sampling designs with 2 and 3 observations were compared with those from the full sampling design in terms of precision and bias.
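The dosing and full sampling schedule described above can be laid out on a common time axis, as in this minimal sketch. Times are expressed in hours from the first Monday dose; the function and constant names are hypothetical, and the clock time of dosing is assumed to be the same on every dosing day.

```python
# Post-dose sampling times (h) after the last Friday infusion, from the text.
POST_DOSE_HOURS = [1, 3, 6, 12, 24, 30, 48, 54, 72]

def mwf_dose_times(n_weeks=4):
    """Dose times (h) for a Monday-Wednesday-Friday regimen."""
    return [168 * week + offset
            for week in range(n_weeks)
            for offset in (0, 48, 96)]

def full_design(last_dose_time):
    """One sample 30 min before the dose and nine samples after it."""
    return [last_dose_time - 0.5] + [last_dose_time + h for h in POST_DOSE_HOURS]
```

Sparse designs with 2 or 3 observations are subsets of the times returned by `full_design`.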
External evaluation
New data extracted from WAPPS-Hemo on September 14th, 2018 were used to perform an external evaluation to determine whether the model we derived produced PK outcomes on new patients that were similar to those in the development dataset. Bayesian forecasting was performed to estimate CL and V1, and to derive half-life, TAT2 and the concentration–time profile for every subject. This evaluation aims to ensure that the model does not produce erroneous results when used to predict PK profiles in new patients.
To assess whether this model, built using routine clinical care data, produced outcomes similar to those of a generic PopPK model for plasma-derived FVIII currently used on WAPPS-Hemo [25], Bayesian forecasting was completed with both models for the 49 patients and the outcomes were compared by coefficient of determination (\(R^{2}\)). The generic model was derived using 2760 observations from 310 patients (n = 7 brands) who underwent densely sampled PK studies as part of industry and investigator-initiated research projects. Specific covariates are included for plasma-derived concentrates, accounting for 14 subjects who received Emoclot and 35 subjects who received Octanate. This evaluation aims to assess whether a plasma-derived FVIII model built using real-world data produces outcomes on new patients similar to those of a plasma-derived FVIII model built using clinical trial data.
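The comparison between the two models can be sketched as below, assuming the coefficient of determination is taken as the squared Pearson correlation between the two sets of individual estimates. This definition is an assumption, as the source does not state which form was used, and the function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between paired estimates x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)
```

Here `x` and `y` would hold, for example, the half-life estimated for each patient by the two models; a value close to 1 indicates that both models rank and scale patients similarly, although it does not by itself rule out a systematic offset.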