Variability Attribution for Automated Model Building

We investigated the possible advantages of using linearization to evaluate models of residual unexplained variability (RUV) for automated model building in a similar fashion to the recently developed method “residual modeling.” Residual modeling, although fast and easy to automate, cannot identify the impact of implementing the needed RUV model on the imprecision of the rest of model parameters. We used six RUV models to be tested with 12 real data examples. Each example was first linearized; then, we assessed the agreement in improvement of fit between the base model and its extended models for linearization and conventional analysis, in comparison to residual modeling performance. Afterward, we compared the estimates of parameters’ variabilities and their uncertainties obtained by linearization to conventional analysis. Linearization accurately identified and quantified the nature and magnitude of RUV model misspecification similar to residual modeling. In addition, linearization identified the direction of change and quantified the magnitude of this change in variability parameters and their uncertainties. This method is implemented in the software package PsN for automated model building/evaluation with continuous data. Electronic supplementary material The online version of this article (10.1208/s12248-019-0310-5) contains supplementary material, which is available to authorized users.


INTRODUCTION
Nonlinear mixed effect (NLME) modeling, commonly known as the population approach, is increasingly used to describe longitudinal data from preclinical/clinical experiments, either to improve the efficiency of the drug development process and subsequent dosing, or increase the understanding of the studied underlying pathophysiological system (1). In contrast to naive pooling approach, which ignores individual differences, and two-stage approach, which does not distinguish between subject and observation variability, NLME models allow pooling of sparse data from different subjects while simultaneously quantifying multiple levels of variability, thanks to its mixed effects nature. In mixed-effects analysis, population parameters are included in a model as fixed effects, and the variability within this population as random effects. Random effects can incorporate variability on both the subject and observation levels, as inter-individual variability (IIV), between occasion variability, between study variability, and residual unexplained variability (RUV). This ability to identify different sources of variability is particularly critical to many clinical applications, e.g., therapeutic drug monitoring.
For highly nonlinear models, extending the structural base model to include covariates or test different models for random effects can be tedious and interrupted by numerical difficulties. These problems increase exponentially with increasing the complexity of the structure, covariate, and variability models. To overcome such computational and time-intensive burden, linear approximation of first-order conditional estimation (FOCE) method was proposed and applied as a diagnostic tool for testing covariates and random effects (2,3). When successfully implemented, linearization substantially reduced runtimes compared to standard NLME models as the fixed effects are not estimated in the linearized models, but fixed to their estimates from the fit of the NLME model. Linearized models were shown to result in similar objective function values (OFVs) to the NLME models, and accurately identify significant covariate relations and stochastic components similar to conventional analysis. Hence, linearization output models in a standardized coding format, linearization was also recommended for automated model building by coupling to other covariate modeling algorithms as stepwise covariate method (SCM) or full random effects covariate modeling (FREM) (4). However, linearized models still need to be estimated given the original observations similar to NLME models, so it might be sensitive to local minima or other estimation-related issues, especially in presence of interactions between empirical Bayes estimates and RUV models. Major deviations between the OFV of the linearized structure base model and its corresponding NLME model should be interpreted as a failure of implementation of linearization and must be solved prior to further investigations using the linearized model. It has not been shown previously that random effects estimated in linearized models or their uncertainties' have similar values if estimated in the corresponding NLME models, which if true, will support the role of linearization in automated model building to predict changes in random variability assigned to model parameters upon the inclusion of a potential covariate or adoption of a new RUV model. Meanwhile, a new method "residual modeling" was proposed as a fast and robust diagnostic tool for assessing RUV models for NLME analysis with continuous outcomes (5). Residual modeling treats the outputted residuals from a NLME model execution as a dependent variable to model its distribution's mean and variance by a linear base model, then this base model is extended to assess different RUV extensions. The improvement in the fit between the residuals base model and its extended versions can accurately identify the nature and magnitude of potential RUV model improvements/ misspecifications, and hence, residual modeling has been already implemented for automated model building. Residual modeling uses a built-in library of six RUV extensions to model the variance of the residuals' distribution from a NLME model execution. The built-in library includes autoregressive (AR1), dynamic transform both sides (dTBS), residuals' IIV, power, tdistribution, and time-varying RUV models (6-10). The investigated residuals were conditional weighted residuals (CWRES), conditional weighted residuals with interaction (CWRESI), individual weighted residuals (IWRES), and normalized prediction distribution errors (NPDE); CWRES outperformed the rest, as CWRES modeling correctly identified the type of the needed RUV model and accurately predicted both the estimates of parameters governing this RUV model and the magnitude of improvement of fit after implementing such RUV model. Residual modeling does not suffer from local minima problems or estimation related issues, as it is using residuals data, not the original observations. This is an advantage for its purpose in fast and robust selection of the best RUV model, but then by definition, it cannot predict the impact of implementing a new RUV model on random variability assigned to the rest of model parameters or their uncertainties.
Here, we investigated if linearization can predict variability attribution for automated model building on the inclusion of a new RUV extension. We used the same six RUV models from our previous work for residual modeling. (5) First, we compared the performance of linearization to residual modeling in selecting the best RUV extension; then, we compared random effects' estimates and uncertainties on linearized models with the different RUV extensions to their corresponding NLME models.

Linearization
For continuous outcome, let y ij be the observation j for individual i, θ is the vector of population parameters, η i is the vector of unexplained deviation of individual parameters θ i from the population parameters θ, x ij is the vector of individual i design components as dose and sampling times, and ε ij is the residual error of observation j for individual i, then the NLME model describing the observations: where f is model prediction, and h is the RUV model to be function of ε ij . Such model can be extended further for multivariate outcome, baseline or time varying covariates. Both random effects η i and ε ij are assumed to follow normal distribution with mean 0 and covariance matrix Ω and Σ, respectively, and the unknown model parameters are estimated by maximum likelihood. According to the way of the dependence of h on f, this NLME model can be linearized based on first-order Taylor expansion around ε ij = 0 and the empirical Bayes estimate η b i : where y ij * is the linearized model, f ij is the approximated individual predictions, and h ij is the approximated individual residual errors.
The NLME model is first evaluated to calculate the different partial derivatives and η b i needed, then y ij * is estimated on the same dataset of the NLME model to obtain η i and ε ij , as these are the only unknown parameters in y ij * . With estimating only random effects alongside its standard coding format, y ij * can be easily and quickly used as a base model for further explaining η i with covariates or using different ε ij models (3). Here, we extended (Eq. 5) to test six RUV models, and compare their goodness of fit, parameters' variability estimates, and uncertainties to conventional testing by NLME models as follow and shown in the Supplementary material.

RUV extensions
To test the dependence of ε ij at time point j on ε ik at time point k, autoregression (AR1) error model with one extra parameter can be implemented: where ρ is the correlation between these errors and t 1/2 is the half-life of ρ. The improvement of fit after implementing AR1 error model in the linearized model (ΔOFV lin, AR1 ) is calculated as the difference in OFV of the linearized base model (Eq. 5) and OFV of the linearized model with AR1 error model (Eqs. 5 and 6). ΔOFV lin, AR1 is comparable to the improvement of fit on implementing AR1 error model in the NLME model (ΔOFV NLME, AR1 ) between the base NLME model (Eq. 1) and its AR1 error model extension.
In presence of skewness in residuals distribution, dynamic transform both sides (dTBS) approach is useful through the estimation of a Box-Cox shape parameter λ and a power term ζ that also address possible scedasticity in residual magnitudes (8,9). Linearized models with dTBS approach follows (Eq. 7) if λ was estimated to 0, and (Eq. 8) otherwise. Improvement of fit on dTBS implementation (ΔOFV lin, dTBS ) is the difference in OFVs of the dTBS linearized model with λ and ζ fixed to 1 and 0, respectively, and the dTBS linearized model with both λ and ζ estimated.
One of maximum likelihood assumptions regarding the residual error ε ij is being identically distributed, this assumption can be relaxed by adding inter-individual variability η i, RUV on the residuals to allow different RUV magnitudes. Improvement of such extension in the fit of the linearized model (ΔOFV lin, IIV ) is the difference in OFVs of (Eq. 5) and (Eq. 9).
In absence of skewness, the dependence of residuals magnitude on model predictions can be corrected with ζ alone in what is known as power RUV model. Improvement of fit on applying the power RUV model to the linearized models (ΔOFV lin, ζ ) is the difference in OFVs of (Eq. 5) and (Eq. 10).
Assuming normal distribution of residuals means that large errors do not exist, which if not true will force maximum likelihood estimation to shift model parameters' estimates to fulfill small errors assumption. This bias can be avoided by introducing t-distributed residuals. The Laplacian method with user-defined conditional likelihood (L) had to be used for a Laplace linearized base model (Eq. 11) and linearized model with t-distributed residuals (Eq. 12), where σ is the square root of h ij , and υ is the degrees of freedom; the difference of these models' OFVs is ΔOFV lin, υ .
Lastly, time-varying errors allow different error magnitudes for different time points. A typical example is that the absorption phase in a pharmacokinetic model can have larger errors than the elimination phase. This is implemented by allowing the change of the standard deviation of residuals to be a step function of the time or time after dose, at selected cutoff time point X.
where θ 1 is the standard deviation of residuals before the cutoff time point X, θ 2 is the standard deviation of residuals after this cutoff time point, and Σ is fixed to 1, as multiplying a random variable by constant (ω • ε ij ) increase the variance by the square of this constant (ω 2 ). We used three cutoff points to divide the data into four equal sized groups, and the improvement of fit (ΔOFV lin, time ) after extending the linearized base model (Eqs. 4 and 5) to the linearized model with time varying residuals (Eqs. 5 and 13) is the difference between their respective OFVs.

Evaluations
These extended linearized models (example code in Supplementary material) were estimated to obtain their respective improvement of fit ΔOFV lin , as well as Ωs' estimates and uncertainties. We compared the performance of ΔOFV lin in predicting ΔOFV NLME to that of ΔOFV Diagnostic obtained by residual modeling, where diagnostic refers to the used residual. Afterwards, we compared Ωs' estimates and uncertainties of linearized models to their respective NLME models as shown in Fig. 1. We used 12 real data examples for our evaluation (Table I).
Only when the linearized base model and the NLME base model had similar OFV were RUV extensions added and further estimated. All real data examples were treated as continuous. Asenapine effects were assessed using PANSS, which is a composite score, where items of positive, negative, and general nature are scored and combined into one assessment. Despite this, the asenapine data was treated as continuous data in the model. Also, the asenapine model was implemented with residuals' IIV model from the start. Models varied in structure components from simple pharmacokinetic one compartment model as moxonodine, to complex description of nonlinear system of interacting multidependent variables as the integrated glucose-insulin (IGI) model. Seven models used log-transformed data. Two models used a combined error model, two models used a proportional error model and the remaining models used additive error models. NONMEM version 7.4.3 (ICON Development Solutions, Hanover, MD, USA) was used for the analysis (22), with the aid of the linearize tool in PsN (3,23), and graphs were generated in R (24). To obtain the improvement of fit by residual modeling (ΔOFV Diagnostic ) when testing the different RUV extensions on the real data examples, we used the resmod tool in PsN (5).

RESULTS
Linearization was successfully applied to all examples, justified by the similarities in the OFVs of the linearized base models and the NLME base models. All examples were extended successfully to the different RUV models, except for AR1 and t-distribution error models with Clomethiazole and the IGI models. All examples benefitted significantly with one or more of the RUV extensions, except for Daunorubicin model. Across all examples, the agreement between ΔOFV lin and ΔOFV NLME was good as shown in Fig. 2. Comparing to the performance of residual modeling in predicting ΔOFV NLME , linearization surpassed CWRESI, IWRES, and NPDE over all different ranges of ΔOFV NLME , and performed better than CWRES at most ranges of ΔOFV NLME except at low ranges of ΔOFV NLME (~10) where CWRES was slightly better.  Regarding the estimates of Ωs on linearized models (Ω lin ) and their respective estimates on NLME models (Ω NLME ), they showed good agreement with only one outlier: AUC50's variability in asenapine model. A plot of log (Ω NLME ) versus log (Ω lin ) across all examples with base models and their RUV extensions is shown in Fig. 3, with estimates less than − 4 on log scale excluded from the graph, given that these estimates are low and would not be considered in further model development. In total, nine estimates were excluded based on this, e.g., the variability assigned to the intercompartmental clearance in Clomethiazole model under IIV on RUV and dTBS extensions. Standard errors (SEs) of each Ω lin and its respective Ω NLME showed a good agreement in the commonly expected range of SEs for a well identifiable continuous data variability parameter (0-1), and bad agreement at the extreme estimates of SE(Ω NLME ), for instance, the SE of PAN0's variability in asenapine model was > 1000 with both dTBS and power RUV extensions, which is unacceptable. This may be related to the scores used to measure asenapine effect, i.e., PANSS. A plot of the logtransformed estimates of SE(Ω NLME ) and SE(Ω lin ) is presented in Fig. 4, with estimates less than − 4 on log scale excluded from the graph. Lastly, relative standard errors (RSEs) for each Ω lin and its corresponding Ω NLME were calculated on the standard deviation scale as (Eq. 14), and their log-transformed estimates are presented in Fig. 5, that in addition to showing the same trends as Fig. 4, showed that standard errors after implementing t-distribution extensions are less predictable by linearization than the other RUV extensions.

DISCUSSION
In this paper, we explored if the use of linearization to identify and quantify RUV model misspecifications, similar to residual modeling (5), can provide additional advantages. Residual modeling assesses whether RUV extensions are required to address an RUV misspecification. It is done in an extremely fast and robust way, thanks to the simple nature of models for residuals data. In case of multiple dependent variables, residual modeling evaluates the RUV extensions separately for each dependent variable, identifying which variable need which extension, and so reducing the risk of ending up with an over-parameterized NLME model. However, being estimated on residual data has shortcomings, as residual modeling cannot inform on the rest of the NLME model parameters. Implementation of a needed RUV extension in a NLME model would be expected to improve the uncertainties of Ω and θ subsequently, as the latter is a function of the former. Linearization, in contrast to residual modeling, uses the calculated parameters' partial derivatives with respect to η b i from the fit of the NLME model. It estimates the RUV model incorporating any extension and the random effects components given the same data as the NLME model. Thus, linearization can estimate explicitly the random effects and their uncertainties in the base and the extended model, and implicitly the magnitude and the direction of change in the random effects and their uncertainties, and that is what we had shown here.
We successfully implemented six RUV extensions to the standardized linearization framework and linearized all real data examples. However, estimation difficulties were present when applying AR1 and t-distribution RUV extensions to the N LM E/lineari zed models of Clomethiazole and the IGI, but not in their respective residual modeling. The agreement between ΔOFV lin and ΔOFV NLME when improvement of fit is > 10 was nearly perfect, indicating that only the estimates of random effects were changing on implementing the different RUV extensions in the NLME models. Deviations would be expected if estimates of fixed effects were also changing. The overall prediction performance of ΔOFV NLME by ΔOFV lin was better than ΔOFV CWRES , however not by much.   Ethambutol and Disufenton sodium models and could not identify t-distribution as a significant extension in Asenapine model. Even though it is a minor difference, it illustrated the high sensitivity of linearization to detect differences between the RUV extensions that introduce similar flexibility in the model, e.g., IIV and t-distribution RUV models both offer outlier robustness in the NLME model. This showed that conclusions drawn from the results of automated testing of RUV extensions will remain the same on replacement of residual modeling with linearization, and the only expected difference would be an increased run time for linearization of structure models with large random effects models.
Regarding the prediction of the impact of RUV extensions on Ω NLME , linearization showed a good ability with only one outlier (the Ω assigned to AUC50 in A s e n a p i n e m o d e l ) . I n t e r e s t i n g l y, l i n e a r i z a t i o n underestimated this Ω with all RUV extensions except tdistribution. This underestimation issue will escalate when it comes to predicting SE(Ω NLME ). Linearization did well in assessing the expected ranges of uncertainties of variability assigned to model parameters describing continuous data; more deviations occurred as uncertainties' estimates moved away from that range, with the main problem being Asenapine model. This might point out that violation of assumptions regarding the nature of data will be a problematic in this automated testing procedure as Asenapine effects are measured using PANSS, which is a composite score, but treated as continuous data in the model. Among the RUV extensions, t-distribution was the most associated with deviations, mainly underestimation. That can be easily tracked down to the use of LAPLACE method commonly known for minimization-related problems, for instance, SE(Ω NLME, υ ) of AUC50 in Asenapine model was 1.66 × 10 −4 which is too close to 0, and an unreasonable estimate for uncertainty, given that the estimate of Ω NLME, υ of AUC50 is 2.9. This problem of unreasonable estimates in SE(Ω NLME ) explain all the extreme deviations seen in Fig.  4. One of these is PAN0 parameter in asenapine model which Ω NLME, dTBS estimate was 168, but its SE(Ω NLME, dTBS ) was 4.15 × 10 4 . With these deviations being justified, it is safe to claim that linearization itself or its predictive performance of SE(Ω NLME ) showed no built-in drawbacks. The same issues go for RSE(Ω NLME ) as not respecting the nature of the data, t-distribution extension, and the unrealistic estimates of Ω NLME and their uncertainties propagated to most of the outliers in Fig. 5.
In conclusion, we investigated the possible merits of linearization if used to evaluate RUV models for continuous data. Linearization accurately identified the nature of RUV extension if needed and predicted the improvement of fit on its inclusion similar to residual modeling. In addition, linearization can predict the impact of including such RUV extension on the variability assigned to model parameters and their uncertainties, allowing its utilization for variability attribution with automated model building procedures.

MMA Ibrahim acknowledges the Egyptian Ministry of
Higher Education for financial support.

COMPLIANCE WITH ETHICAL STANDARDS
Conflict of Interest/Disclosure There is no involvement, financial or otherwise, that might potentially bias this work.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

PUBLISHER'S NOTE
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.