Introduction

When announcing difficult public health measures and social restrictions in response to surging COVID-19 cases, vaccine side effects, or business and social pressures, leaders and policy-makers standing side by side with scientists and public health officials have been at pains to explain that they are “following the science” and/or the “advice of health and medical experts”.

The vision of science and medicine embodied here is the popular one: white coats, laboratories, stethoscopes, test tubes, highly controlled experiments and observation. Yet in reality, the science driving much of the decision-making likely looks very different. The modelling side of the scientific effort can resemble a dispersed network of mathematical and/or computational modellers who draw on multiple sources of data and expertise from across the public and academic realms and deploy software models on remote, high-performance computing clusters. These efforts take information from several quarters (e.g. laboratory science, epidemiological evidence and behavioural science) and consolidate and integrate these data with expert judgement and abstractions of social norms, dynamics, networks and structures, interactions, cognition and patterns of movement, in an effort to generate an overall representation of the world, its mechanics and its likely trajectory under different policy settings and conditions.

These representations are often agent-based models (ABMs) that mirror fine-grained artificial societies. Others may be system dynamics models, compartmental models, discrete event simulation models or other associated mathematical representations. Each approach has its own strengths, weaknesses and level of sophistication and complexity [1].
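
To make these representations concrete, below is a minimal sketch of a compartmental (SIR) model, the simplest of the approaches listed above. It is illustrative only: the transmission rate, recovery rate and population size are hypothetical values, not estimates from any real epidemic.

```python
# A minimal compartmental (SIR) sketch; all parameter values are hypothetical.

def simulate_sir(beta=0.3, gamma=0.1, n=1_000_000, i0=10, days=180):
    """Daily-step (Euler) integration of susceptible/infectious/recovered counts."""
    s, i, r = n - i0, i0, 0
    history = []
    for _ in range(days):
        new_infections = beta * s * i / n  # transmission moves S -> I
        new_recoveries = gamma * i         # recovery moves I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

peak_infectious = max(i for _, i, _ in simulate_sir())
print(f"Illustrative peak infectious count: {peak_infectious:,.0f}")
```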

We draw a somewhat artificial distinction [2] between mathematical and computational models here: mathematical models are those typically used to describe patterns in observed data and to understand relationships between variables that may hold true into the future, whereas computational models are developed more to mirror the dynamic processes and mechanisms presumed to drive system outcomes over time. Depending on the point and depth at which policy-makers wish to observe, predict, forecast or intervene in a system, both mathematical and computational models can be effective tools for understanding and exploring the complex, dynamic relationships between the multiple factors that affect systems combining spatial, social, behavioural and biological processes (e.g. pandemics) [3]. They can help policy-makers examine ideas and interventions, and test them within synthetic societies, before taking real-world action [4]. So, while the COVID-19 pandemic has rapidly accelerated collaboration and progress across many research domains [5] (e.g. vaccine development), it has also produced a proliferation of computational public health models of the type intended for use by government, social care and health system managers to support better decision-making [6, 7].
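
The distinction can be illustrated in code. Whereas the SIR sketch above advances aggregate equations, an agent-based sketch simulates individual contacts and infection events and lets the aggregate epidemic curve emerge from them. Again this is a minimal illustration under assumed values; the contact rate, transmission probability and infectious period are hypothetical.

```python
# A deliberately tiny agent-based sketch of an epidemic; all values hypothetical.
import random

def simulate_abm(pop=2_000, p_transmit=0.05, contacts_per_day=8,
                 infectious_days=10, i0=5, days=120, seed=42):
    """Random-mixing agents: None = susceptible, int = days infectious, 'R' = recovered."""
    rng = random.Random(seed)
    state = [None] * pop
    for idx in rng.sample(range(pop), i0):
        state[idx] = 0
    infectious_per_day = []
    for _ in range(days):
        infectious = [a for a, s in enumerate(state) if isinstance(s, int)]
        for agent in infectious:                # each infectious agent makes contacts
            for _ in range(contacts_per_day):
                other = rng.randrange(pop)
                if state[other] is None and rng.random() < p_transmit:
                    state[other] = 0            # new infection
        for a, s in enumerate(state):           # progress disease clocks
            if isinstance(s, int):
                state[a] = 'R' if s + 1 >= infectious_days else s + 1
        infectious_per_day.append(len(infectious))
    return infectious_per_day

print("Peak number of infectious agents:", max(simulate_abm()))
```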

However, despite their potential, some computational models (such as ABMs) have been regarded as so-called black boxes when compared with more standard compartmental models, which are not developed from a systems-thinking perspective. They can therefore fail to earn the trust of their intended recipients [8]. This failure must be overcome if computational models and their potential benefits are to be embedded in routine decision-making processes and to assist leaders in making better-quality decisions.

Public health modelling amid crises

In public health, models can be used as tools for structured decision-making as well as for informing, analysing, explaining, speculating and planning [9]. Models enable decision-makers to consider the potential impact of different variables (e.g. vaccination rates) and policy decisions (e.g. mask mandates, lockdowns) on defined population health outcomes [6, 10,11,12]. Whilst the public and policy-makers often want models to be predictive, fine-grained prediction within complex, dynamic systems is difficult if not impossible [13]. Models are therefore perhaps more usefully embraced as tools for forecasting the likely pattern of outcomes that might emerge under various conditions over time [6]. They can also be used to estimate the health and economic costs and benefits of different public health interventions [14].
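
As a minimal illustration of this kind of scenario exploration, the sketch below reruns a toy SIR model under hypothetical policy settings (a contact-reducing mandate modelled as a lower transmission rate, and prior vaccination coverage modelled as agents starting in the removed compartment) and compares a headline outcome. All scenarios and parameter values are assumptions chosen for illustration, not calibrated estimates.

```python
# Scenario comparison with a toy SIR model; all settings are assumptions.

def sir_peak(beta=0.3, gamma=0.1, n=1_000_000, i0=10,
             vaccinated_fraction=0.0, days=240):
    """Peak infectious count from a daily-step SIR run."""
    r = vaccinated_fraction * n      # vaccinated agents start as removed
    s, i = n - i0 - r, i0
    peak = i
    for _ in range(days):
        new_inf = beta * s * i / n
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        peak = max(peak, i)
    return peak

scenarios = {
    "baseline":          sir_peak(),
    "contact reduction": sir_peak(beta=0.2),                # e.g. distancing mandate
    "60% vaccinated":    sir_peak(vaccinated_fraction=0.6),
}
for name, peak in scenarios.items():
    print(f"{name:>17}: peak infectious ~ {peak:,.0f}")
```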

The use of models to support public health decision-making is nothing new. Multiple reports, articles and guidelines lay out clear, methodical means of developing models with policy-makers, built on principles of collaboration, participation and iteration [15,16,17,18,19]. However, most of these frameworks were created under the assumption that neither the modeller nor the user is facing an immediate crisis, such as a global pandemic, that precludes lengthy development and collaboration cycles.

Policy-makers facing novel, urgent crises with deep uncertainty and often limited scientific understanding or training [20,21,22,23] cannot politically or ethically wait for so-called normal scientific processes to run their course and for evidence to mature before acting [4, 24]. Policy-makers engaged with the science [25, 26] will quickly realize that formal models constructed amid emerging crises need to draw on available and/or sufficient evidence rather than provide absolute certainty [17]. This point was highlighted in the early stages of the SARS-CoV-2 pandemic by Michael Ryan, Director of WHO’s Health Emergencies Programme, who noted, “If you need to be right before you move, you will never win. Perfection is the enemy of the good when it comes to emergency management” [24].

The key consideration for policy-makers facing crises is not to discard models that cannot deliver evidence with very high certainty, but to recognize (1) the features of a model that indicate it will be useful, and (2) when the evidence it generates is timely and robust enough to be acted upon.

We therefore recommend that policy-makers undertake a “rapid appraisal” of models made available to them. In doing this, we recommend they consider three elements of model utility:

1. instrumental utility, taking into account model (a) inputs, (b) mechanisms and (c) outputs;
2. conceptual utility; and
3. political utility [27].

We step through these elements below.

Instrumental utility requires the model to produce evidence that adequately answers questions posed of it by policy-makers—that is, it is “fit for purpose”. Policy-makers should feel as though they are “in” the model and can “drive” it by manipulating its policy levers and experimenting with scenarios. The model should look at a problem and attempt to solve it from the policy-maker’s perspective.

Evidence that has instrumental utility can be used to adjust or inform policy decisions facing governments and administrators (i.e. the evidence purports to show what policy levers are available, how and when those policies could be enacted, and what the consequences might be). This is the typical technical goal of modelling teams. A model that has instrumental utility is calibrated, valid and robust, and provides guidance that is as clear and accurate as possible given the uncertainty inherent in the crisis. As far as possible, the model should meet formal criteria for quality of theory, realism and mechanics, and objectivity as set out by the discipline(s) that contributed to its structure. It should be transparent, reproducible and able to generate results within an adequately short time frame for it to be used in crisis planning. It should demonstrate both internal and external validity.
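
As a minimal sketch of what calibration involves, the example below recovers a transmission rate by grid search against stand-in “observed” infection counts (here generated synthetically; real calibration would fit to genuine surveillance data using more sophisticated estimation methods).

```python
# Calibration sketch: grid search for the transmission rate that best
# reproduces observed infectious counts. All values are illustrative.

def sir_curve(beta, gamma=0.1, n=100_000, i0=10, days=60):
    """Infectious counts per day from a daily-step SIR run."""
    s, i, r = n - i0, i0, 0
    curve = []
    for _ in range(days):
        new_inf = beta * s * i / n
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        curve.append(i)
    return curve

# Stand-in for surveillance data: in reality this would come from reporting
# systems, with all their noise and lag.
observed = sir_curve(0.27)

def squared_error(beta):
    return sum((m - o) ** 2 for m, o in zip(sir_curve(beta), observed))

best_beta = min((b / 100 for b in range(10, 51)), key=squared_error)
print(f"Calibrated transmission rate: {best_beta:.2f}")  # recovers 0.27
```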

The technical robustness of the model’s construction is just as valuable as its theoretical robustness, since minor technical errors can propagate into more significant errors, resulting in incorrect advice that, in turn, may lead to suboptimal policy decisions. A model formally defined according to the criteria set out by the contributing discipline(s) should be implemented using formal testing and verification methods where possible [28]. Transparent model-checking [29] using formal methods is crucial to maintaining a model’s instrumental utility. For example, the formal validation and verification used in electronic voting systems helps maintain public trust and policy-maker support [30].
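
One lightweight form of such testing is invariant checking: automated tests asserting properties that any correct implementation must preserve, so that small coding errors cannot silently propagate into advice. The sketch below applies this idea to a toy SIR model; the invariants shown (population conservation, non-negative compartments, no outbreak without transmission) are illustrative rather than exhaustive, and fall well short of the formal verification methods cited above [28, 29].

```python
# Invariant checks for a toy SIR model; a sketch, not formal verification.

def simulate_sir(beta=0.3, gamma=0.1, n=1_000_000, i0=10, days=180):
    s, i, r = n - i0, i0, 0
    history = []
    for _ in range(days):
        new_inf = beta * s * i / n
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        history.append((s, i, r))
    return history

def test_population_is_conserved():
    # S + I + R must equal the starting population at every step
    assert all(abs(s + i + r - 1_000_000) < 1e-3 for s, i, r in simulate_sir())

def test_compartments_stay_non_negative():
    assert all(min(step) >= 0 for step in simulate_sir())

def test_no_outbreak_without_transmission():
    # With beta = 0, only the seed infections can ever recover
    *_, (_, i, r) = simulate_sir(beta=0.0)
    assert i + r <= 10 + 1e-9

test_population_is_conserved()
test_compartments_stay_non_negative()
test_no_outbreak_without_transmission()
print("All invariant checks passed.")
```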

To assess a model’s instrumental utility, the user should broadly compare three elements of what they know of the real world with what they understand of the presented model world. They should ask how closely the real world is matched by the model’s inputs (the features and elements included in the model), its mechanisms (the way the model combines those inputs to answer the question “what is going on here?”) and its outputs (the range of outcomes the model generates to describe its performance) (see Fig. 1). If there is a high degree of agreement across all three areas, the model is more likely to be a faithful representation of reality and its outcomes more likely to be trustworthy. Taking these elements individually: poor input data will produce poor output data even if a model’s mechanics are sound; poor model mechanics will turn even good input data into poor output data, or misrepresent the means (i.e. the policies) by which good outcomes might be achieved [16]; and poorly specified or limited outcome data (e.g. overlooking unintended effects of policy decisions) will not give policy-makers adequate insight into the full range of outcomes they need to consider (e.g. the health and economic consequences of policy choices).

Fig. 1 Qualities policy-makers should look for in models used to assist decision-making
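
A simple way to operationalize this appraisal of inputs and outputs is one-at-a-time sensitivity analysis: perturb each input and observe the effect on a headline output. Inputs that swing the output most are those whose concordance with the real world deserves the closest scrutiny. The sketch below does this for a toy SIR model; the baseline values and the +10% perturbation are arbitrary illustrative choices.

```python
# One-at-a-time sensitivity sketch: perturb each input by +10% and report
# the change in a headline output. All baseline values are hypothetical.

def sir_peak(beta=0.3, gamma=0.1, i0=10, n=1_000_000, days=240):
    s, i, r = n - i0, i0, 0
    peak = i
    for _ in range(days):
        new_inf = beta * s * i / n
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        peak = max(peak, i)
    return peak

baseline = sir_peak()
defaults = {"beta": 0.3, "gamma": 0.1, "i0": 10}
for name, value in defaults.items():
    shifted = sir_peak(**{name: value * 1.1})   # +10% perturbation
    change = (shifted - baseline) / baseline
    print(f"+10% in {name:<5} -> peak infections change by {change:+.1%}")
```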

Conceptual utility is the model’s capacity to be effectively communicated, thereby convincing the public, policy-makers, advisors and co-decision-makers that the evidence produced by the model is robust and explainable, makes sense and can be trusted. Evidence with conceptual utility can influence the perceptions of policy-makers, public servants and the general public around the need for policy action (i.e. the evidence is stark, easily communicated, transparent and convincing). The model and the evidence it provides should have so-called face validity for the public and those around the decision-making table. Model transparency as described above is important, as are the qualifications, expertise, backgrounds and track records of the model producers. The team or individuals who created the model should be known, credible and respected by their peers. They should be perceived as independent, free from conflicts of interest and/or from respected independent institutions. The model should broadly agree with evidence from other sources, but also generate occasional surprising but explainable moments of new insight for users. All in all, the model must deliver valuable insight that is not possible without it, and must be seen to provide credible, transparent evidence that can be trusted and defended.

For models to have political utility, their instrumental and conceptual utility must be strong enough to support defensible policy actions that will be supported by the community over extended time. Political utility enables the implications of the model to be actively and successfully woven into policy-making and desired actions of government. A model that does not have conceptual utility will not have political utility. A model that does not have instrumental utility will be found wanting through inaccuracy, so will also lose conceptual and eventually political utility. Once credibility is lost, trust in models will then be difficult to recover, and future efforts may be met with increased scepticism, leading to reluctance on the part of the community to comply with policy-makers’ directives.

Only when these three interdependent forms of utility derived from the model’s outputs are satisfied (see Fig. 2) are windows of opportunity [31] likely to open for scientific evidence derived through models to successfully integrate with public policy-making.

Fig. 2 Three elements of instrumental, conceptual and political utility that assist models in being useful for policy-makers

Using these criteria, Table 1 provides a set of guidelines that policy-makers can use to evaluate whether any presented model approaches the thresholds for use in their local context.

Table 1 Features of models that provide utility to policy-makers in times of crisis

Conclusions

Computational modelling in public health can be an extremely low-cost, high-value exercise. In crisis situations where public health is at stake, models can have even greater utility. Despite this, computational models have not progressed far beyond the realm of “toys” in the minds of some policy-makers [8]. Worse, poor models or poor experiences with models—including overpromising predictive power—can result in the entire field being dismissed by policy-makers and the public as confusing, contradictory and untrustworthy [33].

To help alleviate this problem, we present a framework that asks both model developers and policy-makers to evaluate the utility of models across three related dimensions. This framework asks more of model developers who intend their work to be used for decision support in public health. We ask developers to gear their efforts towards model users: to make their models fit for purpose, transparent and comparable, and to do so with real-world adoption and application in mind. We suggest that, to be adopted, models need to demonstrate the qualities of instrumental, conceptual and political utility.

Computational modelling has the potential to act as a consolidating discipline that sits at the interface of science, public health research and public policy across the world. Greater integration between model developers and end-users on what creates a useful model is likely to enhance the understanding and utilization of computational models, as well as the models themselves.