Comparison of covariate selection methods with correlated covariates: prior information versus data information, or a mixture of both?

The inclusion of covariates in population models during drug development is a key step in understanding drug variability and supporting dosage regimen proposals, but high correlation among covariates often complicates the identification of the true covariate. We compared three covariate selection methods that balance data information and prior knowledge: (1) full fixed effect modelling (FFEM), with covariate selection prior to data analysis, (2) simplified stepwise covariate modelling (sSCM), with data-driven selection only, and (3) Prior-Adjusted Covariate Selection (PACS), which mixes both. PACS penalizes the a priori less likely covariate model by adding to its objective function value (OFV) a constant derived from the prior probability: $2\,\ln\left(\Pr(X)/(1-\Pr(X))\right)$, $\Pr(X)$ being the probability of the more likely covariate. Simulations were performed to compare the external performance of the methods (average OFV in a validation dataset of 10,000 subjects) in selecting the true covariate between two correlated covariates (correlation of 0.5, 0.7, or 0.9), after a training step on datasets of 12, 25 or 100 subjects (increasing power). With low power data no method was superior, except FFEM when the covariates were highly correlated ($r = 0.9$), as sSCM and PACS both suffered from selection bias. For high power data, PACS and sSCM performed similarly, and both were superior to FFEM.
PACS is an alternative for covariate selection considering both the expected power to identify an anticipated covariate relation and the probability of prior information being correct. A proposed strategy is to use FFEM whenever the expected power to distinguish between contending models is < 80%, PACS when > 80% but < 100%, and SCM when the expected power is 100%.

Electronic supplementary material
The online version of this article (10.1007/s10928-020-09700-5) contains supplementary material, which is available to authorized users.


Selection of the covariate modelling strategy
For the pharmacokinetic model of the moxonidine concentration-time data in congestive heart failure patients (Karlsson et al. 1998), there are two covariates of particular interest to describe the variability of clearance: weight (WT) and creatinine clearance (CRCL). These two covariates are highly correlated (r=0.69). The description of the data is provided above in the Data section.
The choice of covariate selection method (FFEM, sSCM, or PACS) depends on the expected power of the data to (1) select a covariate versus no covariate, and (2) discriminate between two correlated covariates. As discussed in the paper, FFEM is more appropriate for a low power case, whereas for data with higher power sSCM or PACS is preferable because it makes use of the information in the data.
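The strategy proposed in the paper (FFEM below 80% expected power, PACS between 80% and 100%, SCM at 100%) can be written as a small decision rule. The following is only a sketch: the function name is hypothetical, and assigning an expected power of exactly 80% to PACS is an assumption, since the paper does not state which method applies at that boundary.

```python
def choose_covariate_method(expected_power):
    """Map an expected power (as a fraction, 0 to 1) to a selection method.

    Thresholds follow the strategy proposed in the paper. Treating a power
    of exactly 0.80 as the PACS case is an assumption of this sketch.
    """
    if not 0.0 <= expected_power <= 1.0:
        raise ValueError("expected_power must be a fraction between 0 and 1")
    if expected_power < 0.80:
        return "FFEM"  # low power: rely on prior selection
    if expected_power < 1.0:
        return "PACS"  # high but imperfect power: mix prior and data
    return "SCM"       # full power: data-driven selection


print(choose_covariate_method(0.93))
```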
To select the most appropriate covariate method for the moxonidine data, power was computed via a model-based simulation approach, comparing the pharmacokinetic model without covariates, $M_0$, and the two covariate models, $M_{CRCL}$ and $M_{WT}$. These covariate models were based on prior expectations, and the variability in CL originating from the covariate component was similar for the two sets of simulations.
• $M_{CRCL}$: linear model with creatinine clearance, assuming 50% of the elimination is renal for the typical patient.
• $M_{WT}$: power model, with the allometric exponent set to 0.75.
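As an illustration, the two covariate relations on clearance could be parameterized as follows. This is a sketch under stated assumptions, not the parameterization used in the analysis: the text specifies only a linear CRCL model with 50% renal elimination for the typical patient and a power model on weight with exponent 0.75. The function names, the reference values `crcl_ref` and `wt_ref`, and the exact functional forms are assumptions.

```python
def cl_crcl_linear(tvcl, crcl, crcl_ref, renal_fraction=0.5):
    """Linear CRCL model (assumed form).

    The renal part of the typical clearance scales proportionally with
    CRCL; the non-renal part is constant. At crcl == crcl_ref the typical
    clearance tvcl is returned.
    """
    return tvcl * ((1.0 - renal_fraction) + renal_fraction * crcl / crcl_ref)


def cl_wt_power(tvcl, wt, wt_ref=70.0, exponent=0.75):
    """Power model on weight with a fixed allometric exponent (assumed form).

    wt_ref = 70 kg is a conventional reference weight, assumed here.
    """
    return tvcl * (wt / wt_ref) ** exponent


# Hypothetical patient with half the reference CRCL: CL drops by 25%.
print(cl_crcl_linear(tvcl=10.0, crcl=40.0, crcl_ref=80.0))
```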
Stochastic simulations (n = 1000) were performed with each of the two covariate models, and the simulated data were estimated with all three models (no covariate, CRCL and WT), including estimation of the covariate-related parameters. As shown in the table below, the power is high, both to select each covariate model over no covariate, and to select the true covariate (used in the simulation) over the correlated one. The power was computed with the LRT when the models were nested (α = 0.05), and with the AIC when they were not. With such power to detect the assumed covariate relationships (>80% but <100%), PACS is selected. Since moxonidine is known to be renally cleared to a large extent, creatinine clearance is favoured over weight. PACS with a prior probability of 0.75 favouring CRCL over WT results in a modelling strategy that accounts for the fact that the data are largely expected to discriminate between the two covariates, but additionally incorporates the prior expectation of CRCL being more likely. Numerically, a prior of 0.75 translates to an OFV penalty of 2 * ln(0.75/0.25) ≈ 2.20 for the WT model.
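The power calculation and the PACS penalty described above can be sketched in a few lines. All function names are hypothetical, and the inputs are assumed to be lists of OFVs, one per simulated data set. The LRT helper assumes a single additional covariate parameter (1 degree of freedom), which the text does not state explicitly; under that assumption the critical value is the square of a standard normal quantile.

```python
import math
from statistics import NormalDist


def lrt_power(ofv_reduced, ofv_full, alpha=0.05):
    """Fraction of replicates where the LRT prefers the full model.

    Assumes nested models differing by ONE parameter (1 df).
    """
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2  # about 3.84
    hits = sum((r - f) > critical for r, f in zip(ofv_reduced, ofv_full))
    return hits / len(ofv_reduced)


def aic_power(ofv_true, ofv_other, n_par_true=1, n_par_other=1):
    """Fraction of replicates where the true model has the lower AIC."""
    hits = sum(
        (t + 2 * n_par_true) < (o + 2 * n_par_other)
        for t, o in zip(ofv_true, ofv_other)
    )
    return hits / len(ofv_true)


def pacs_penalty(prior_prob):
    """OFV penalty for the a priori less likely model: 2*ln(Pr/(1-Pr))."""
    if not 0.0 < prior_prob < 1.0:
        raise ValueError("prior_prob must lie strictly between 0 and 1")
    return 2.0 * math.log(prior_prob / (1.0 - prior_prob))


def pacs_select(ofv_likely, ofv_unlikely, prior_prob):
    """Pick a model after penalising the less likely one (ties: 'likely')."""
    penalised = ofv_unlikely + pacs_penalty(prior_prob)
    return "likely" if ofv_likely <= penalised else "unlikely"


print(round(pacs_penalty(0.75), 4))
```

With a prior of 0.75, the less likely model (here WT) is selected only if its OFV is lower than that of the more likely model (CRCL) by more than the penalty.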

Models
Following the fit of the three models to the patient data, the PACS strategy identified the model incorporating CRCL as a covariate on moxonidine CL as superior to the alternative models. The differences in OFV between the no-covariate and covariate models were similar to those expected from the simulations.

Cross-validation
To estimate the predictive performance of the three models ($M_0$, $M_{CRCL}$, $M_{WT}$), a ten-fold cross-validation was performed (Ribbing et al. 2007). The data were split into ten parts. Ten training data sets were constructed, each by pooling nine of the ten parts, and the predictive performance was assessed on the remaining part, the validation data set. The OFVs of the ten validation data sets were averaged to obtain the cross-validation OFV.
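The cross-validation procedure can be outlined generically as below. This is a sketch only: `fit` and `ofv` are hypothetical callables standing in for model estimation on a training set and OFV evaluation on a validation set, and splitting at the subject level after a random shuffle is an assumption.

```python
import random


def cross_validation_ofv(subjects, fit, ofv, k=10, seed=0):
    """Average validation OFV over k folds.

    subjects : sequence of per-subject records
    fit      : callable(training_subjects) -> estimated parameters
    ofv      : callable(parameters, validation_subjects) -> OFV
    """
    shuffled = list(subjects)
    random.Random(seed).shuffle(shuffled)
    # Deal subjects into k parts of (nearly) equal size.
    folds = [shuffled[i::k] for i in range(k)]

    validation_ofvs = []
    for i, validation in enumerate(folds):
        # Pool the other k-1 parts into the training data set.
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        params = fit(training)
        validation_ofvs.append(ofv(params, validation))
    return sum(validation_ofvs) / k


# Toy usage: "model" is a mean, "OFV" is a sum of squared errors.
data = [4.0, 5.0, 6.0] * 10
toy_fit = lambda train: sum(train) / len(train)
toy_ofv = lambda mean, valid: sum((x - mean) ** 2 for x in valid)
print(cross_validation_ofv(data, toy_fit, toy_ofv))
```

Running the same function once per candidate model yields one cross-validation OFV per model, which can then be compared.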

Discussion
The simulations indicated that the power to discriminate against no covariate was high, ca. 98%, and that it was still high but slightly lower for discrimination between the two covariates, ca. 93%. In this situation, the prior knowledge that the renal route is important is allowed to weigh in on the otherwise data-driven selection. The prior value of 0.75 allows us to specify a preference for, but not complete reliance on, CRCL as the better model. In this case, the simulations provided a good assessment of the discriminatory power of the data with respect to the covariate association with clearance. This is evidenced by the differences in OFV in the simulations being similar to those in the real data set. The cross-validation results indicate that the CRCL model has the best predictive performance with respect to OFV.