We would like to thank Ramaekers et al. [1] for their interest in our publication, which highlights some of the challenges faced when validating complex, multi-disease models.

We welcome the discussion of feedback from the evidence review group appraisal. Drawing on the experience of submitting to the National Institute for Health and Care Excellence, we have updated the Core Obesity Model (COM), and the revised version is currently being used in other countries. While there is uncertainty associated with certain aspects of the model predictions, the internal and external validation aimed to provide a degree of reassurance that those predictions are generally consistent with external data.

Another important point is the definition of acceptable concordance between predicted and observed values, and we agree that this is a challenging aspect of model validation. Indeed, the International Society for Pharmacoeconomics and Outcomes Research/Society for Medical Decision Making guidelines emphasize that it is not feasible to quantify the desired level of accuracy for the predictions made by models, stating that “it is not possible to specify criteria that a model must meet to be declared ‘valid’, as if validity were a property of the model that applies to all of its applications and uses for all time” [2]. Additional statistical methods, like those described by Corro Ramos et al., may be of use; however, as stated by those authors, any approach would still require decision makers to establish a necessary accuracy level, which may rely on more factors than prediction accuracy versus empirical data per se [3].

Finally, we agree that uncertainty may arise from extrapolating short-term outcomes based on selected, published risk estimates, and that alternative options are limited. When risk equations are used to predict outcomes, it is assumed that the risk of an outcome following a weight management intervention is determined by the effect of that intervention, irrespective of baseline levels. Modelling legacy effects that account for previous weight profiles would, however, require risk equations that capture such effects to be available. These aspects are particularly important when considering a multi-application model like the COM, which can evolve over time as new clinical evidence becomes available.

We hope that these additional discussions are a constructive adjunct to our manuscript and help to contextualize the role of the COM as a starting point from which future iterations can incorporate newer risk estimates, when available, and as the basis for cost-effectiveness analyses in specific clinical settings.