
Uncertainty Computation at Finite Distance in Nonlinear Mixed Effects Models—a New Method Based on Metropolis-Hastings Algorithm

  • Research Article
  • Published in The AAPS Journal

Abstract

The standard errors (SE) of the maximum likelihood estimates (MLE) of the population parameter vector in nonlinear mixed effects models (NLMEM) are usually estimated using the inverse of the Fisher information matrix (FIM). However, at finite distance, i.e. far from the asymptotic regime, the FIM can underestimate the SE of NLMEM parameters. Alternatively, the standard deviation of the posterior distribution, obtained in Stan via the Hamiltonian Monte Carlo algorithm, has been shown to be a proxy for the SE, since, under some regularity conditions on the prior, the limiting distributions of the MLE and of the maximum a posteriori estimator in a Bayesian framework are equivalent. In this work, we develop a similar method using the Metropolis-Hastings (MH) algorithm run in parallel to the stochastic approximation expectation maximisation (SAEM) algorithm, implemented in the saemix R package. We assess this method on different simulation scenarios and on data from a real case study, comparing it to other SE computation methods. The simulation study shows that our method improves on the results obtained with frequentist methods at finite distance. However, it performed poorly in a scenario with the high variability and correlations observed in the real case study, stressing the need for calibration.


Data Availability

The real data used in this article comes from a clinical trial conducted by F Hoffmann-La Roche and is not publicly available. Requests for access should be made to François Mercier.

References

  1. Bauer RJ. NONMEM Tutorial Part I: description of commands and options, with simple examples of population analysis. CPT: Pharmacometrics Syst Pharmacol. 2019;8(8):525–37.

  2. Bauer RJ. NONMEM Tutorial Part II: estimation methods and advanced examples. CPT: Pharmacometrics Syst Pharmacol. 2019;8(8):538–56.

  3. Lavielle M. Mixed effects models for the population approach: models, tasks, methods and tools. Chapman and Hall/CRC; 2014.

  4. Comets E, Lavenu A, Lavielle M. Parameter estimation in nonlinear mixed effect models using saemix, an R implementation of the SAEM algorithm. J Stat Softw. 2017;80:1–41.

  5. Dubois A, Lavielle M, Gsteiger S, Pigeolet E, Mentré F. Model-based analyses of bioequivalence crossover trials using the stochastic approximation expectation maximisation algorithm. Stat Med. 2011;30(21):2582–600.

  6. Loingeville F, Bertrand J, Nguyen T, Sharan S, Feng K, Sun W, et al. New model-based bioequivalence statistical approaches for pharmacokinetic studies with sparse sampling. AAPS J. 2020;22(6):141.

  7. Bertrand J, Comets E, Chenel M, Mentré F. Some alternatives to asymptotic tests for the analysis of pharmacogenetic data using nonlinear mixed effects models. Biometrics. 2012;68(1):146–55.

  8. Thai H, Mentré F, Holford N, Veyrat-Follet C, Comets E. Evaluation of bootstrap methods for estimating uncertainty of parameters in nonlinear mixed-effects models: a simulation study in population pharmacokinetics. J Pharmacokinet Pharmacodyn. 2013;41(1):15–33.

  9. Dosne A, Bergstrand M, Harling K, Karlsson M. Improving the estimation of parameter uncertainty distributions in nonlinear mixed effects models using sampling importance resampling. J Pharmacokinet Pharmacodyn. 2016;43(6):583–96.

  10. Ueckert S, Riviére M, Mentré F. Alternative to resampling methods in maximum likelihood estimation for NLMEM by borrowing from Bayesian methodology. PAGE 24; 2015. https://www.page-meeting.org/?abstract=3632.

  11. van der Vaart AW. Asymptotic statistics (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge University Press; 1998.

  12. Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, et al. Stan: a probabilistic programming language. J Stat Softw. 2017;76(1).

  13. Guhl M, Mercier F, Hofmann C, Sharan S, Donnelly M, Feng K, et al. Impact of model misspecification on model-based tests in PK studies with parallel design: real case and simulation studies. J Pharmacokinet Pharmacodyn. 2022;49(5):557–77.

  14. Delyon B, Lavielle M, Moulines E. Convergence of a stochastic approximation version of EM algorithm. Ann Stat. 1999;27(1):94–128.

  15. Panhard X, Mentré F. Evaluation by simulation of tests based on non-linear mixed-effects models in pharmacokinetic interaction and bioequivalence cross-over trials: evaluation of tests based on NLMEM. Stat Med. 2005;24(10):1509–24.

  16. Gelman A, Gilks W, Roberts G. Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann Appl Probab. 1997;7(1).

  17. Morris T, White I, Crowther M. Using simulation studies to evaluate statistical methods. Stat Med. 2019;38(11):2074–102.

  18. Neal R. Probabilistic inference using Markov chain Monte Carlo methods. Technical Report CRG-TR-93-1. Department of Computer Science, University of Toronto; 1993.

  19. Bertrand J, Comets E, Laffont C, Chenel M, Mentré F. Pharmacogenetics and population pharmacokinetics: impact of the design on three tests using the SAEM algorithm. J Pharmacokinet Pharmacodyn. 2009;36:317–39.


Acknowledgements

The authors would like to thank Dr. Maud Delattre for her valuable input on the methods of this article, and Hervé Le Nagard, Lionel de la Tribouille and Rémy Bertino for the use of the CATIBioMed computing facility. The illustrative example data were obtained from studies sponsored by F Hoffmann-La Roche. We thank the participants and investigators involved in these studies.

Author information

Contributions

MG, JB and EC designed the study. MG and LF performed the analyses. MG wrote the article. JB and EC critically revised it. FM provided the data and approved the final version of the article.

Corresponding author

Correspondence to Mélanie Guhl.

Ethics declarations

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix

SAEM

 

  • The probability \(p(y_i | \psi _i)\) of the observations \(y_i = \{ y_{ij} \}\) for subject i is assumed to follow a known distribution (most often Gaussian) depending on the individual parameters \(\psi _i\)

  • Modelling of the joint distribution \(p(y, \psi , \theta ; A)\) (for several subjects)

    $$\begin{aligned} p(y, \psi ,\theta ;A)&= p(y | \psi , \theta ) \; p(\psi | \theta ) \; p(\theta | A) \end{aligned}$$
    (10)
    • \(\theta =(M,B,\Omega )\) is the population parameter vector

    • \(\psi \) is the individual parameter matrix

    • A is the (potential) hyperparameter vector of the prior on \(\theta \)

The EM algorithm is a two-step iterative method developed to maximise the likelihood in the presence of missing data (here, the individual parameters, which are not observed):

  • Computation of the conditional expectation of the loglikelihood l of the complete data \(u=(y,\psi )\) given the incomplete data y and the current estimate \(\theta ^k\) (E step):

    $$\begin{aligned} Q(y;\theta |\theta ^k) = \mathbb {E}(l(\theta ;u)|y;\theta ^k)&= \int l(\theta ;u)p(\psi |y;\theta ^k)d\psi \end{aligned}$$
    (11)
  • Maximisation of this conditional expectation of the loglikelihood of the complete data (M step):

    $$\begin{aligned} \theta ^{k+1}&= \textrm{Arg}\max _\theta Q(y;\theta |\theta ^k) \end{aligned}$$
    (12)

In the SAEM algorithm, the E step is approximated by simulating a single value from the conditional distribution of the complete data \(p(u|y;\theta ^k)\) at each iteration, which is considerably faster than stochastic integration.

At iteration k of SAEM:

  • Simulation step: draw \(\psi _k\) from the conditional distribution \(p(\cdot |y;\theta _k)\).

    • because the conditional distribution cannot be sampled from directly, the simulation step is replaced at iteration k by m iterations of the Metropolis-Hastings algorithm (detailed below)

    • in the default algorithm, 2 iterations of each of 3 successive kernels are used

  • Stochastic approximation step: update the conditional loglikelihood l at iteration k, \(l_k(\theta )\), according to:

    $$\begin{aligned} l_k(\theta )&= l_{k-1}(\theta ) + \gamma _k ( \log p(y,\psi _k;\theta ) - l_{k-1}(\theta ) ) \end{aligned}$$
    (13)

    where \((\gamma _k)\) is a decreasing sequence of positive numbers such that \(\gamma _1=1\), \( \sum _{k=1}^{\infty } \gamma _k = \infty \) and \(\sum _{k=1}^{\infty } \gamma _k^2 < \infty \).

  • Maximisation step: update \(\theta _k\) as

    $$\begin{aligned} \theta _k&= \textrm{Arg}\max _\theta l_k(\theta ) \end{aligned}$$
    (14)

The first \(K_1\) iterations of SAEM constitute the exploratory phase, and the following \(K_2\) iterations constitute the smoothing phase.
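
As an illustration, here is a minimal R sketch of a step-size schedule and of the stochastic approximation update of Eq. 13. The two-phase schedule (step size 1 during the \(K_1\) exploratory iterations, then decreasing during the \(K_2\) smoothing iterations) is a common choice but not necessarily the exact one used in saemix; updating the value of \(l_k\) at a fixed \(\theta \) is also a simplification (in practice the algorithm works with sufficient statistics), and all function names are ours.

  # Step-size sequence gamma_k: equal to 1 during the exploratory phase,
  # then decreasing so that sum(gamma_k) diverges and sum(gamma_k^2) converges.
  gamma_k <- function(k, K1) {
    if (k <= K1) 1 else 1 / (k - K1)
  }

  # Stochastic approximation update of Eq. 13 for a given theta:
  # l_k = l_{k-1} + gamma_k * (log p(y, psi_k; theta) - l_{k-1}).
  # 'loglik_complete' is a user-supplied function returning log p(y, psi; theta).
  sa_update <- function(l_prev, y, psi_k, theta, k, K1, loglik_complete) {
    l_prev + gamma_k(k, K1) * (loglik_complete(y, psi_k, theta) - l_prev)
  }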

MH algorithm in simulation step:

  • for \(i=1,2,\ldots ,N\), denoting \(\psi _{i,0}=\psi _{i}^{(k-1)}\), and for \(p=1,2,\ldots ,m\), the MH algorithm consists of:

    1. draw \( \tilde{\psi }_{i,p}\) using the proposal kernel \( q_{\theta _k}( \psi _{i,p-1},\cdot ) \)

    2. set \( \psi _{i,p} = \tilde{\psi }_{i,p} \) with probability

      $$\begin{aligned} \alpha ( \psi _{i,p-1} , \tilde{\psi }_{i,p} ) = \min \left( 1, \frac{ p( \tilde{\psi }_{i,p} |y_i;\theta _k) q_{\theta _k}(\tilde{\psi }_{i,p} , \psi _{i,p-1} )}{ p( \psi _{i,p-1} |y_i;\theta _k) q_{\theta _k}(\psi _{i,p-1} ,\tilde{\psi }_{i,p} )} \right) \end{aligned}$$
      (15)
    3. set \(\psi _{i,p} = \psi _{i,p-1}\) with probability \(1- \alpha ( \psi _{i,p-1}, \tilde{\psi }_{i,p} )\).

  • let \(\psi _i^{(k)} = \psi _{i,m}\)

In both Monolix (3) and saemix (4), 2 iterations of each of 3 successive transition kernels are used.
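
To make the acceptance rule of Eq. 15 concrete, here is a minimal R sketch of one MH pass for subject i, assuming a symmetric Gaussian random-walk proposal (so the proposal densities cancel in the ratio); the three successive kernels actually used in saemix are not reproduced, and log_cond, sd_prop and the function names are placeholders of our own.

  # One Metropolis-Hastings move for subject i (Eq. 15), computed on the log scale.
  # 'log_cond' must return log p(psi_i | y_i; theta_k) up to an additive constant,
  # i.e. log p(y_i | psi_i) + log p(psi_i; theta_k).
  mh_step <- function(psi, log_cond, sd_prop) {
    psi_prop  <- psi + rnorm(length(psi), sd = sd_prop)  # random-walk proposal
    log_alpha <- log_cond(psi_prop) - log_cond(psi)      # symmetric proposal: q terms cancel
    if (log(runif(1)) < log_alpha) psi_prop else psi     # accept, otherwise keep current value
  }

  # m successive MH iterations, as in the simulation step of SAEM
  mh_sweep <- function(psi0, log_cond, sd_prop, m = 2) {
    psi <- psi0
    for (p in seq_len(m)) psi <- mh_step(psi, log_cond, sd_prop)
    psi
  }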

Maximum Likelihood Estimator of the Population Parameters \(\theta \): \(\theta \) is not considered a random variable but is treated as a fixed parameter.

$$\begin{aligned} \hat{\theta }_{ML}&= \textrm{Arg}\max _{\theta } Q(y;\theta ) \end{aligned}$$
(16)
Fig. 4 Zip plot obtained with Asympt on the rich design

Fig. 5 Zip plot obtained with SIR on the rich design

Fig. 6 Zip plot obtained with Post on the rich design

Fig. 7 Zip plot obtained with MH on the rich design

Fig. 8 Zip plot obtained with MH (variance inflated by 1.5) on the rich design

Fig. 9 Zip plot obtained with MH (variance inflated by 2) on the rich design

Fig. 10 Zip plot obtained with Asympt on the sparse design

Fig. 11 Zip plot obtained with SIR on the sparse design

Fig. 12 Zip plot obtained with Post on the sparse design

Fig. 13 Zip plot obtained with MH on the sparse design

Fig. 14 Zip plot obtained with MH (variance inflated by 1.5) on the sparse design

Fig. 15 Zip plot obtained with MH (variance inflated by 2) on the sparse design

Simulation Study—Zip Plots

Zip plots are a way to visualise why coverage is or is not controlled in a simulation study by displaying all confidence interval (CI) estimates (17). For each estimated parameter, the confidence intervals obtained on the simulated datasets are ranked according to the ratio of bias over SE. The ranks are then plotted against the confidence intervals, highlighting in grey those that do not cover the simulated value. A horizontal line represents the 0.95 target coverage rate.

This kind of plot shows both bias (when the deviation of the CIs from the true value is heavier on one side) and SE over- or underestimation (when the grey CIs not covering the true value lie below or above the target line, respectively).
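
As an illustration, a minimal base-R sketch of such a zip plot for one parameter is given below; the inputs (vectors of estimates and SEs across simulated datasets, plus the simulated true value) and the plotting choices are assumptions, not the code used in the study.

  # est, se: estimates and standard errors over the simulated datasets
  # true: the simulated (true) value of the parameter
  zip_plot <- function(est, se, true, level = 0.95) {
    z   <- qnorm(1 - (1 - level) / 2)
    lo  <- est - z * se
    hi  <- est + z * se
    ord <- order(abs(est - true) / se)                  # rank datasets by |bias| / SE
    covered <- (lo <= true & true <= hi)[ord]
    plot(NULL, xlim = range(lo, hi, true), ylim = c(1, length(est)),
         xlab = "confidence interval", ylab = "rank of |bias| / SE")
    segments(lo[ord], seq_along(est), hi[ord], seq_along(est),
             col = ifelse(covered, "black", "grey60"))  # grey CIs miss the simulated value
    abline(v = true, lty = 2)                           # simulated (true) value
    abline(h = level * length(est), lty = 3)            # 0.95 target coverage rate
  }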

Fig. 16 Comparison of RSE obtained for the different model parameters with the Post and HMC methods on 100 rich (X) and sparse (O) datasets

Simulation Study—Comparison of Post with a Full Bayesian Method

We compared the RSE obtained with Post and a full Bayesian approach as implemented through the HMC algorithm in Stan (12), hereafter called HMC, on a subset of 100 datasets.

HMC was run in Stan (12) with three chains of 11,000 iterations (including 1000 warm-up iterations). The prior distribution was the same as for Post.

The initial values were set to the true values for the rich datasets and, to help the chains converge, to lower \(\Omega \) values for the sparse datasets. Even with that help, 23% of the datasets gave \(\hat{R}\) values too high for the results to be used, which suggests that the HMC algorithm is sensitive to initial values and that, in the Post method, the initial values taken from the saemix fit helped achieve better performance.
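
For reference, a minimal rstan sketch matching these run settings is shown below; the model file, data list and initial values are placeholders (the actual Stan model and priors used in the study are not reproduced).

  library(rstan)

  # Three chains of 11,000 iterations each, including 1,000 warm-up iterations.
  # 'model.stan' and 'stan_data' are placeholders for the NLMEM Stan code and data.
  init_list <- lapply(1:3, function(chain) list(omega = rep(0.3, 3)))  # e.g. lower Omega starting values
  fit <- stan(file = "model.stan", data = stan_data,
              chains = 3, iter = 11000, warmup = 1000, init = init_list)

  # Check convergence (Rhat) before using the posterior standard deviations as SE proxies
  rhat    <- summary(fit)$summary[, "Rhat"]
  post_sd <- apply(as.matrix(fit), 2, sd)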

In Fig. 16, we compare the results obtained with a full Bayesian algorithm (method called HMC) to those obtained with Post and show that they are very similar.

Tables of RSE on Gantenerumab Real Data

In Tables II and III, we present the results shown in the star plots of Fig. 3, with an additional column showing that, once again, the results obtained with a full Bayesian algorithm (method called HMC) are very similar to the Post results.

Table II Relative Standard Errors Computed for the Parameters of the Model Fitted on the Full Dataset (N=48) with Asympt, SIR, Post, HMC and SAEM_MH Methods
Table III Relative Standard Errors Computed for the Parameters of the Model Fitted on the Sparse Subset of the Data (N=12) with Asympt, SIR, Post, HMC and SAEM_MH Methods

Extension of the Simulation Study

Fig. 17 95% coverage rates obtained with Asympt, SIR, Post and SAEM_MH (inflation factor of the variance kernel at 1)

Following the real case study, we extended our simulation study with more challenging features encountered in the application, i.e. higher variances and correlations in the random effects. We simulated 100 sparse datasets with the same settings as in the first part of the simulation study except for the inter-individual variability matrix \(\Omega \): \(\omega _{ka}=\omega _{CL}=\omega _{V}=1.1\) and \(\rho _{ka,CL}=0.9\), \(\rho _{ka,V}=0.98\), \(\rho _{CL,V}=0.9\). Figure 17 shows the coverage rates obtained in this extension of the simulation study. Asympt gave more accurate coverage rates because the high correlations between parameters compensated for the sparsity of the data, although uncertainty was still underestimated, as it was with SIR. The Post method coverage rates were below the target for all parameters, although the \(\hat{R}\) values were not particularly higher and the proportion of datasets failing to converge was stable. The SAEM_MH acceptance rates were very low (\(2.5\%\)), indicating that the method was unable to sample sufficiently to assess the uncertainty; indeed, all coverage rates were below the target.

Of note, five datasets could not be used for the Post method due to \(\hat{R}\) being too high.
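
As a short illustration of these settings, the sketch below builds the corresponding random-effect covariance matrix and simulates individual random effects with MASS::mvrnorm; treating the \(\omega \) values as standard deviations, the parameter ordering (ka, CL, V) and the number of simulated subjects are assumptions on our part.

  library(MASS)

  # Standard deviations and correlations of the random effects in the extended scenario
  omega <- c(ka = 1.1, CL = 1.1, V = 1.1)
  rho   <- matrix(c(1.00, 0.90, 0.98,
                    0.90, 1.00, 0.90,
                    0.98, 0.90, 1.00),
                  nrow = 3, dimnames = list(names(omega), names(omega)))
  Omega <- diag(omega) %*% rho %*% diag(omega)   # covariance matrix of the random effects

  # Draw random effects for, e.g., 100 hypothetical subjects
  eta <- mvrnorm(n = 100, mu = rep(0, 3), Sigma = Omega)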

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guhl, M., Bertrand, J., Fayette, L. et al. Uncertainty Computation at Finite Distance in Nonlinear Mixed Effects Models—a New Method Based on Metropolis-Hastings Algorithm. AAPS J 26, 53 (2024). https://doi.org/10.1208/s12248-024-00905-x

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1208/s12248-024-00905-x

Keywords

Navigation