The posterior predictive check (PPC) is a model evaluation tool. It assigns a value (pPPC) to the probability that the value of a given statistic computed from data arising under an analysis model is as extreme as, or more extreme than, the value computed from the real data themselves. If this probability is too small, the analysis model is regarded as invalid for the given statistic. Properties of the PPC for pharmacokinetic (PK) and pharmacodynamic (PD) model evaluation are examined herein for a particularly simple simulation setting: extensive sampling of a single individual's data arising from simple PK/PD and error models. To test the performance characteristics of the PPC, “real” data are repeatedly simulated and, for a variety of statistics, the PPC is applied to an analysis model, which may (null hypothesis) or may not (alternative hypothesis) be identical to the simulation model. Five models are used here: (PK1) monoexponential with proportional error, (PK2) biexponential with proportional error, (PK2ε) biexponential with additive error, (PD1) Emax model with additive error under the logit transform, and (PD2) sigmoid Emax model with additive error under the logit transform. Six simulation/analysis settings are studied. The first three, (PK1/PK1), (PK2/PK2), and (PD1/PD1), evaluate whether the PPC has an appropriate type-I error level, whereas the last three, (PK2/PK1), (PK2ε/PK2), and (PD2/PD1), evaluate whether the PPC has adequate power. For a set of 100 data sets simulated/analyzed under each model pair according to a stipulated extensive sampling design, the pPPC is computed for a number of statistics in three different ways (each way uses a different approximation to the posterior distribution on the model parameters). We find that, in general: (i) The PPC is conservative under the null in the sense that, for many statistics, prob(pPPC≤α)<α for small α. With respect to such statistics, this means that useful models will rarely be regarded incorrectly as invalid.
A high correlation of a statistic with the parameter estimates obtained from the same data used to compute the statistic (a measure of statistical “sufficiency”) tends to identify the most conservative statistics. (ii) Power is not very great, at least for the alternative models we tested, and it is especially poor with “statistics” that are in part a function of parameters as well as data. Although there is a tendency for nonsufficient statistics (as we have measured this) to have greater power, this is by no means an infallible diagnostic. (iii) No clear advantage for one or another method of approximating the posterior distribution on model parameters is found.
Keywords: posterior predictive check; p value; model evaluation; pharmacokinetics; pharmacodynamics
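As an illustration of how a pPPC might be computed for a single statistic, the following is a minimal sketch in Python for a monoexponential model with proportional error. It is not the procedure used in the study: the function names (`mono_exp`, `simulate`, `p_ppc`), the clearance/volume parameterization, the numerical values, the one-sided definition of "as or more extreme", and especially the stand-in posterior (log-normal draws centered on the simulation values rather than a posterior approximated from a fit to the data) are all assumptions made for the example.

```python
import numpy as np

def mono_exp(t, dose, cl, v):
    """Monoexponential concentration after a bolus dose (assumed parameterization)."""
    return dose / v * np.exp(-cl / v * t)

def simulate(t, dose, cl, v, cv, rng):
    """One data set with proportional error: y = f * (1 + cv * eps), eps ~ N(0, 1)."""
    f = mono_exp(t, dose, cl, v)
    return f * (1.0 + cv * rng.standard_normal(t.shape))

def p_ppc(y_obs, posterior_draws, t, dose, cv, statistic, rng):
    """Fraction of replicated data sets whose statistic is >= the observed value.

    "As or more extreme" is taken here as the upper tail only (an assumption).
    """
    t_obs = statistic(y_obs)
    # One replicated data set per parameter draw from the (approximate) posterior.
    t_rep = np.array([
        statistic(simulate(t, dose, cl, v, cv, rng))
        for cl, v in posterior_draws
    ])
    return float(np.mean(t_rep >= t_obs))

rng = np.random.default_rng(0)
t = np.linspace(0.25, 12.0, 48)  # extensive sampling of a single individual
dose, cl_true, v_true, cv = 100.0, 5.0, 20.0, 0.1  # hypothetical values

# "Real" data simulated under the same model that is used for analysis (null case).
y_obs = simulate(t, dose, cl_true, v_true, cv, rng)

# Stand-in for an approximate posterior on (CL, V); a real analysis would
# obtain these draws from a fit of the analysis model to y_obs.
draws = np.column_stack([
    cl_true * np.exp(0.05 * rng.standard_normal(1000)),
    v_true * np.exp(0.05 * rng.standard_normal(1000)),
])

p = p_ppc(y_obs, draws, t, dose, cv, np.max, rng)
print(p)
```

Any function of the data can be passed as `statistic` (here the maximum observed concentration). Repeating the whole procedure over many simulated "real" data sets and tabulating how often the resulting value falls at or below α gives an empirical estimate of prob(pPPC≤α) of the kind discussed above.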