Background

This paper addresses the question: ‘What is research impact and how might we measure it?’. It has two main aims: first, to introduce the general reader to a new and somewhat specialised literature on the science of research impact assessment and, second, to contribute to the development of theory and the taxonomy of method in this complex and rapidly growing field of inquiry. Summarising evidence from previous systematic and narrative reviews [1–7], including new reviews from our own team [1, 5], we consider definitions of impact and its conceptual and philosophical basis before reviewing the strengths and limitations of different approaches to its assessment. We conclude by suggesting where future research on research impact might be directed.

Research impact has many definitions (Box 1). Its measurement is important because researchers are increasingly expected to be accountable and to deliver value for money, especially when their work is funded from the public purse [8]. Further, funders seek to demonstrate the benefits from their research spending [9] and there is pressure to reduce waste in research [10]. By highlighting how (and how effectively) resources are being used, impact assessment can inform strategic planning by both funding bodies and research institutions [1, 11].

We draw in particular on a recent meta-synthesis of studies of research impact funded by the UK Health Technology Assessment Programme (HTA review), covering literature mainly published between 2005 and 2014 [1]. The HTA review was based on a systematic search of eight databases (including grey literature) plus hand searching and reference checking, and identified over 20 different impact models and frameworks and 110 studies describing their empirical applications (as single or multiple case studies), although only a handful had proven robust and flexible across a range of examples. The material presented in this summary paper, based on much more extensive work, is inevitably somewhat eclectic. Four of the six approaches we selected as ‘established’ were the ones most widely used in the 110 published empirical studies. Additionally, we included societal impact assessment (although less widely used, it has recently been the subject of a major EU-funded workstream spanning a range of fields) and the UK Research Excellence Framework (REF; empirical work on which post-dated our review), the latter because of the size and uniqueness of its dataset and its significant international interest. The approaches we selected as showing promise for the future were chosen more subjectively, on the grounds that there is currently considerable academic and/or policy interest in them.

Different approaches to assessing research impact make different assumptions about the nature of research knowledge, the purpose of research, the definition of research quality, the role of values in research and its implementation, the mechanisms by which impact is achieved, and the implications for how impact is measured (Table 1). Short-term proximate impacts are easier to attribute, whereas benefits from complementary assets (such as the development of research infrastructure, political support or key partnerships [8]) may accumulate over the longer term and are more difficult – and sometimes impossible – to capture fully.

Table 1 Philosophical assumptions underpinning approaches to research impact

Knowledge is intertwined with politics and persuasion. If stakeholders agree on what the problem is and what a solution would look like, the research-impact link will tend to turn on the strength of research evidence in favour of each potential decision option, as depicted in column 2 of Table 1 [12]. However, in many fields – for example, public policymaking, social sciences, applied public health and the study of how knowledge is distributed and negotiated in multi-stakeholder collaborations – the links between research and impact are complex, indirect and hard to attribute (for an example, see Kogan and Henkel’s rich ethnographic study of the Rothschild experiment in the 1970s, which sought – and failed – to rationalise the links between research and policy [13]). In policymaking, research evidence is rather more often used conceptually (for general enlightenment) or symbolically (to justify a chosen course of action) than instrumentally (feeding directly into a particular policy decision) [12, 14], as shown empirically by Amara et al.’s large quantitative survey of how US government agencies drew on university research [15]. Social science research is more likely to illuminate the complexity of a phenomenon than produce a simple, ‘implementable’ solution that can be driven into practice by incorporation into a guideline or protocol [16, 17], as was shown by Dopson and Fitzgerald’s detailed ethnographic case studies of the implementation of evidence-based healthcare in healthcare organisations [18]. In such situations, the research-impact relationship may be productively explored using approaches that emphasise the fluidity of knowledge and the multiple ways in which it may be generated, assigned more or less credibility and value, and utilised (columns 3 to 6 in Table 1) [12, 19].

Many approaches to assessing research impact combine a logic model (to depict input-activities-output-impact links) with a ‘case study’ description to capture the often complex processes and interactions through which knowledge is produced (perhaps collaboratively and/or with end-user input to study design), interpreted and shared (for example, through engagement activities, audience targeting and the use of champions, boundary spanners and knowledge brokers [20–24]). A nuanced narrative may be essential to depict the non-linear links between upstream research and distal outcomes and/or help explain why research findings were not taken up and implemented despite investment in knowledge translation efforts [4, 6].

Below, we describe six approaches that have proved robust and useful for measuring research impact and some additional ones introduced more recently. Table 2 lists examples of applications of the main approaches reviewed in this paper.

Table 2 Examples of applications of research impact assessment frameworks

Established approaches to measuring research impact

The Payback Framework

Developed by Buxton and Hanney in 1996 [25], the Payback Framework (Fig. 1) remains the most widely used approach. It was used by 27 of the 110 empirical application studies in the recent HTA review [1]. Despite its name, it does not measure impact in monetary terms. It consists of two elements: a logic model of the seven stages of research from conceptualisation to impact, and five categories to classify the paybacks – knowledge (e.g. academic publications), benefits to future research (e.g. training new researchers), benefits to policy (e.g. information base for clinical policies), benefits to health and the health system (including cost savings and greater equity), and broader economic benefits (e.g. commercial spin-outs). Two interfaces for interaction between researchers and potential users of research (‘project specification, selection and commissioning’ and ‘dissemination’) and various feedback loops connecting the stages are seen as crucial.

Fig. 1 The Payback Framework developed by Buxton and Hanney (reproduced under Creative Commons Licence from Hanney et al. [70])

The elements and categories in the Payback Framework were designed to capture the diverse ways in which impact may arise, notably the bidirectional interactions between researchers and users at all stages in the research process from agenda setting to dissemination and implementation. The Payback Framework encourages an assessment of the knowledge base at the time a piece of research is commissioned – data that might help with issues of attribution (did research A cause impact B?) and/or reveal a counterfactual (what other work was occurring in the relevant field at the time?).
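
As a purely illustrative aside, the Python sketch below shows one way the five payback categories could be encoded so that impacts claimed in a case study can be tagged and tallied. The category names follow the framework as described above, but the example impacts, the tagging workflow and all identifiers are hypothetical and are not part of the Payback methodology itself.

```python
# Purely illustrative: encoding the five Payback categories so that impacts
# claimed in a case study can be tagged and counted. The category names follow
# the framework; the example impacts and this workflow are hypothetical.
from collections import Counter
from enum import Enum

class Payback(Enum):
    KNOWLEDGE = "knowledge"                              # e.g. academic publications
    FUTURE_RESEARCH = "benefits to future research"      # e.g. training new researchers
    POLICY = "benefits to policy"                        # e.g. information base for clinical policies
    HEALTH = "benefits to health and the health system"  # e.g. cost savings, greater equity
    ECONOMIC = "broader economic benefits"               # e.g. commercial spin-outs

# Hypothetical impacts claimed for one funded project, tagged during case study work.
claimed_impacts = [
    ("peer-reviewed paper in a specialty journal", Payback.KNOWLEDGE),
    ("PhD student trained on the project", Payback.FUTURE_RESEARCH),
    ("findings cited in a clinical guideline", Payback.POLICY),
]

# Count the claimed impacts falling into each payback category.
profile = Counter(category for _, category in claimed_impacts)
for category in Payback:
    print(f"{category.value}: {profile[category]}")
```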

Applying the Payback Framework through case studies is labour intensive: researcher interviews are combined with document analysis and verification of claimed impacts to prepare a detailed case study containing both qualitative and quantitative information. Not all research groups or funders will be sufficiently well resourced to produce this level of detail for every project – nor is it always necessary to do so. Some authors have adapted the Payback Framework methodology to reduce the workload of impact assessment (for example, a recent European Commission evaluation populated the categories mainly by analysis of published documents [26]); nevertheless, it is not known how or to what extent such changes would compromise the data. Impacts may be short or long term [27], so (as with any approach) the time window covered by data collection will be critical.

Another potential limitation of the Payback Framework is that it is generally project-focused (commencing with a particular funded study) and is therefore less able to explore the impact of the sum total of activities of a research group that attracted funding from a number of sources. As Meagher et al. concluded in their study of ESRC-funded responsive mode psychology projects, “In most cases it was extremely difficult to attribute with certainty a particular impact to a particular project’s research findings. It was often more feasible to attach an impact to a particular researcher’s full body of research, as it seemed to be the depth and credibility of an ongoing body of research that registered with users” [28] (p. 170).

Similarly, the impact of programmes of research may be greater than the sum of their parts due to economic and intellectual synergies, and therefore project-focused impact models may systematically underestimate impact. Application of the Payback Framework may include supplementary approaches such as targeted stakeholder interviews to fully capture the synergies of programme-level funding [29, 30].

Research Impact Framework

The Research Impact Framework was the second most widely used approach in the HTA review of impact assessment, accounting for seven out of 110 applications [1], but in these studies it was mostly used in combination with other frameworks (especially Payback) rather than as a stand-alone approach. It was originally developed by and for academics who were interested in measuring and monitoring the impact of their own research. As such, it is a ‘light touch’ checklist intended for use by individual researchers who seek to identify and select impacts from their work “without requiring specialist skill in the field of research impact assessment” [31] (p. 136). The checklist, designed to prompt reflection and discussion, includes research-related impacts, policy and practice impacts, service (including health) impacts, and an additional ‘societal impact’ category with seven sub-categories. In a pilot study, its authors found that participating researchers engaged readily with the Research Impact Framework and were able to use it to identify and reflect on different kinds of impact from their research [31, 32]. Because of its (intentional) trade-off between comprehensiveness and practicality, it generally produces a less thorough assessment than the Payback Framework and was not designed to be used in formal impact assessment studies by third parties.

Canadian Academy of Health Sciences (CAHS) Framework

The most widely used adaptation of the Payback Framework is the CAHS Framework (Fig. 2), which informed six of the 110 application studies in the HTA review [33]. Its architects claim to have shaped the Payback Framework into a ‘systems approach’ that takes greater account of the various non-linear influences at play in contemporary health research systems. CAHS was constructed collaboratively by a panel of international experts (academics, policymakers, university heads), endorsed by 28 stakeholder bodies across Canada (including research funders, policymakers, professional organisations and government) and refined through public consultation [33]. The authors emphasise that the consensus-building process that generated the model was as important as the model itself.

Fig. 2 Simplified Canadian Academy of Health Sciences (CAHS) Framework (reproduced with permission of the Canadian Academy of Health Sciences [33])

CAHS encourages a careful assessment of context and the subsequent consideration of impacts under five categories: advancing knowledge (measures of research quality, activity, outreach and structure), capacity-building (developing researchers and research infrastructure), informing decision-making (decisions about health and healthcare, including public health and social care, decisions about future research investment, and decisions by the public and citizens), health impacts (including health status, determinants of health – including individual risk factors and environmental and social determinants – and health system changes), and economic and social benefits (including commercialisation, cultural outcomes, socioeconomic implications and public understanding of science).

For each category, a menu of metrics and measures (66 in total) is offered, and users are encouraged to draw on these flexibly to suit their circumstances. By choosing appropriate sets of indicators, CAHS can be used to track impacts within any of the four ‘pillars’ of health research (basic biomedical, applied clinical, health services and systems, and population health – or within domains that cut across these pillars) and at various levels (individual, institutional, regional, national or international).
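
To illustrate how such a menu might be used in practice, the short Python sketch below filters a hypothetical indicator list by category, research pillar and level. The entries, field names and filtering logic are invented for illustration and are not drawn from the actual CAHS menu of 66 metrics.

```python
# Illustrative sketch only: selecting indicators from a CAHS-style menu.
# The entries below are hypothetical examples, not the official CAHS metrics.

MENU = [
    {"indicator": "citations per publication", "category": "advancing knowledge",
     "pillars": {"basic biomedical", "applied clinical"}, "levels": {"individual", "institutional"}},
    {"indicator": "new researchers trained", "category": "capacity-building",
     "pillars": {"health services and systems"}, "levels": {"institutional", "national"}},
    {"indicator": "citations in clinical guidelines", "category": "informing decision-making",
     "pillars": {"applied clinical", "population health"}, "levels": {"national"}},
]

def select_indicators(menu, category=None, pillar=None, level=None):
    """Return menu entries matching the chosen category, research pillar and level."""
    return [
        entry for entry in menu
        if (category is None or entry["category"] == category)
        and (pillar is None or pillar in entry["pillars"])
        and (level is None or level in entry["levels"])
    ]

# e.g. indicators suitable for a national-level assessment of applied clinical research
for entry in select_indicators(MENU, pillar="applied clinical", level="national"):
    print(entry["indicator"])
```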

Despite their differences, Payback and CAHS have much in common, especially in how they define impact and their proposed categories for assessing it. Whilst CAHS appears broader in scope and emphasises ‘complex system’ elements, both frameworks are designed as pragmatic and flexible adaptations of the research-into-practice logic model. One key difference is that the CAHS category ‘informing decision-making’ incorporates both policy-level decisions and the behaviour of individual clinicians, whereas Payback collects data separately on individual clinical decisions on the grounds that, where measurable, decisions by clinicians to change behaviour feed indirectly into the improved health category.

As with Payback (but perhaps even more so, since CAHS is in many ways more comprehensive), the application of CAHS is a complex and specialist task that is likely to be highly labour-intensive and hence prohibitively expensive in some circumstances.

Monetisation models

A significant innovation in recent years has been the development of logic models to monetise (that is, express in terms of currency) both the health and the non-health returns from research. Of the 110 empirical applications of impact assessment approaches in our HTA review, six used monetisation. Such models tend to operate at a much higher level of aggregation than Payback or CAHS – typically seeking to track all the outputs of a research council [34, 35], national research into a broad disease area (e.g. cardiovascular disease, cancer) [36–38], or even an entire national medical research budget [39].

Monetisation models express returns in various ways, including as cost savings, the money value of net health gains via cost per quality-adjusted life year (QALY) using the willingness-to-pay or opportunity cost established by the National Institute for Health and Care Excellence (NICE) or similar bodies [40], and internal rates of return (return on investment as an annual percentage yield). These models draw largely from the economic evaluation literature and differ principally in terms of which costs and benefits (health and non-health) they include and in the valuation of seemingly non-monetary components of the estimation. A national research call, for example, may fund several programmes of work in different universities and industry partnerships, subsequently producing net health gains (monetised as the value of QALYs or disability-adjusted life-years), cost savings to the health service (and to patients), commercialisation (patents, spin-outs, intellectual property), leveraging of research funds from other sources, and so on.

A major challenge in monetisation studies is that, in order to produce a quantitative measure of economic impact or rate of return, a number of simplifying assumptions must be made, especially in relation to the appropriate time lag between research and impact and what proportion of a particular benefit should be attributed to the funded research programme as opposed to all the other factors involved (e.g. social trends, emergence of new interventions, other research programmes occurring in parallel). Methods are being developed to address some of these issues [27]; however, whilst the estimates produced in monetised models are quantitative, those figures depend on subjective, qualitative judgements.
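
The arithmetic behind such estimates can be illustrated with a deliberately simplified Python sketch. All figures, the attribution share and the time lag below are invented for illustration; the structure merely shows how monetised health gains and cost savings, an assumed attribution proportion and an assumed lag feed into an internal rate of return, and hence how sensitive the headline percentage is to those subjective judgements.

```python
# Illustrative sketch only: monetising research returns under assumed figures.
# All numbers are hypothetical; the logic follows the general approach described
# in the text (health gain valued at a willingness-to-pay threshold, an assumed
# attribution share and time lag, then an internal rate of return).

def net_benefit(qalys_gained, wtp_per_qaly, health_system_savings, attribution_share):
    """Monetised annual benefit attributable to the research investment."""
    gross = qalys_gained * wtp_per_qaly + health_system_savings
    return gross * attribution_share

def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-6):
    """Internal rate of return found by bisection on the net present value."""
    def npv(rate):
        return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # NPV falls as the discount rate rises, so move towards the root.
        if npv(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical example: a £10m research spend in year 0; benefits emerge after an
# assumed 5-year lag, persist for 10 years, and 25% are attributed to the programme.
spend = [-10_000_000] + [0] * 4
annual = net_benefit(qalys_gained=150, wtp_per_qaly=25_000,
                     health_system_savings=1_000_000, attribution_share=0.25)
cash_flows = spend + [annual] * 10
print(f"Annual attributable benefit: £{annual:,.0f}")
print(f"Internal rate of return: {irr(cash_flows):.1%}")
```

Changing the assumed attribution share or lag in this sketch changes the reported rate of return substantially, which is precisely the sensitivity to qualitative judgement noted above.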

A key debate in the literature on monetisation of research impact addresses the level of aggregation. First applied to major research budgets in a ‘top-down’ or macro approach [39], whereby total health gains are apportioned to a particular research investment, the principles of monetisation are increasingly being used in a ‘bottom-up’ [34, 36–38] manner to collect data on specific project or programme research outputs. The benefits of new treatments and their usage in clinical practice can be built up to estimate returns from a body of research. By including only research-driven interventions and using cost-effectiveness or cost-utility data to estimate incremental benefits, this method goes some way to dealing with the issue of attribution. Some impact assessment models combine a monetisation component alongside an assessment of processes and/or non-monetised impacts, such as environmental impacts and an expanded knowledge base [41].

Societal impact assessment

Societal impact assessment, used in social sciences and public health, emphasises impacts beyond health and is built on constructivist and performative philosophical assumptions (columns 3 and 6 in Table 1). Some form of societal impact assessment was used in three of the 110 empirical studies identified in our HTA review. Its protagonists distinguish the social relevance of knowledge from its monetised impacts, arguing that the intrinsic value of knowledge may be less significant than the varied and changing social configurations that enable its production, transformation and use [42].

An early approach to measuring societal impact was developed by Spaapen and Sylvain in the early 1990s [43], and subsequently refined by the Royal Netherlands Academy of Arts and Sciences [44]. An important component is self-evaluation by a research team of the relationships, interactions and interdependencies that link it to other elements of the research ecosystem (e.g. nature and strength of links with clinicians, policymakers and industry), as well as external peer review of these links. Spaapen et al. subsequently conducted a research programme, Evaluating Research in Context (ERiC) [45], which produced the Sci-Quest model [46]. Later, they collaborated with researchers who had led a major UK ESRC-funded study on societal impact [47] to produce the EU-funded SIAMPI (Social Impact Assessment Methods through the study of Productive Interactions) Framework [48].

Sci-Quest was described by its authors as a ‘fourth-generation’ approach to impact assessment – the previous three generations having been characterised, respectively, by measurement (e.g. an unenhanced logic model), description (e.g. the narrative accompanying a logic model) and judgement (e.g. an assessment of whether the impact was socially useful or not). Fourth-generation impact assessment, they suggest, is fundamentally a social, political and value-oriented activity and involves reflexivity on the part of researchers to identify and evaluate their own research goals and key relationships [46].

Sci-Quest methodology requires a detailed assessment of the research programme in context and the development of bespoke metrics (both qualitative and quantitative) to assess its interactions, outputs and outcomes, which are presented in a unique Research Embedment and Performance Profile, visualised in a radar chart. SIAMPI uses a mixed-methods case study approach to map three categories of productive interaction: direct personal contacts, indirect contacts such as publications, and financial or material links. These approaches have theoretical elegance, and some detailed empirical analyses were published as part of the SIAMPI final report [48]. However, neither approach has had significant uptake elsewhere in health research – perhaps because both are complex, resource-intensive and do not allow easy comparison across projects or programmes.

Whilst extending impact to include broader societal categories is appealing, the range of societal impacts described in different publications, and the weights assigned to them, vary widely; much depends on the researchers’ own subjective ratings. A planned attempt to capture societal impact in Australia in the mid-2000s (the Research Quality Framework) was abandoned following a change of government [49].

UK Research Excellence Framework

The 2014 REF – an extensive exercise to assess UK universities’ research performance – allocated 20% of the total score to research impact [50]. Each institution submitted an impact template describing its strategy and infrastructure for achieving impact, along with several four-page impact case studies, each of which described a programme of research, the claimed impacts and the supporting evidence. These narratives, which were required to follow a linear and time-bound structure (describing research undertaken between 1993 and 2013, followed by a description of impact occurring between 2008 and 2013), were peer-reviewed by an intersectoral assessment panel representing academia and research users (industry and policymakers) [50]. Other countries are looking to emulate the REF model [51].

An independent evaluation of the REF impact assessment process by RAND Europe (based on focus groups, interviews, survey and documentary analysis) concluded that panel members perceived it as fair and robust and valued the intersectoral discussions, though many felt the somewhat crude scoring system (in which most case studies were awarded 3, 3.5 or 4 points) lacked granularity [52]. The 6679 non-redacted impact case studies submitted to the REF (1594 in medically-related fields) were placed in the public domain (http://results.ref.ac.uk) and provide a unique dataset for further analysis.

In its review of the REF, the members of Main Panel A, which covered biomedical and health research, noted that “International MPA [Main Panel A] members cautioned against attempts to ‘metricise’ the evaluation of the many superb and well-told narrations describing the evolution of basic discovery to health, economic and societal impact” [50].

Approaches with potential for the future

The approaches in this section, most of which have been recently developed, have not been widely tested but may hold promise for the future.

Electronic databases

Research funders increasingly require principal investigators to provide an annual return of impact data on an online third-party database. In the UK, for example, Researchfish® (formerly MRC e-Val but now described as a ‘federated system’ with over 100 participating organisations) allows funders to connect outputs to awards, thereby allowing aggregation of all outputs and impacts from an entire funding stream. The software contains 11 categories: publications, collaborations, further funding, next destination (career progression), engagement activities, influence on policy and practice, research materials, intellectual property, development of products or interventions, impacts on the private sector, and awards and recognition.
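
To show the kind of aggregation such databases make possible, the Python sketch below sums hypothetical researcher-entered returns by funder across output categories. The field names echo the categories listed above, but the record structure and example data are illustrative and are not Researchfish®'s actual schema.

```python
# Minimal sketch of how a federated outputs database might aggregate
# researcher-entered annual returns by funding stream. Illustrative only:
# the field names mirror the 11 output categories described in the text,
# but the structure and data are hypothetical.
from collections import defaultdict

CATEGORIES = [
    "publications", "collaborations", "further_funding", "next_destination",
    "engagement_activities", "policy_and_practice_influence",
    "research_materials", "intellectual_property", "products_or_interventions",
    "private_sector_impacts", "awards_and_recognition",
]

# Each annual return links one award (and hence one funder) to counts of outputs.
returns = [
    {"funder": "Funder A", "award": "AWD-0001", "publications": 4,
     "policy_and_practice_influence": 1},
    {"funder": "Funder A", "award": "AWD-0002", "publications": 2,
     "further_funding": 1},
]

def aggregate_by_funder(entries):
    """Sum outputs per category across all awards held by each funder."""
    totals = defaultdict(lambda: dict.fromkeys(CATEGORIES, 0))
    for entry in entries:
        for category in CATEGORIES:
            totals[entry["funder"]][category] += entry.get(category, 0)
    return totals

print(aggregate_by_funder(returns)["Funder A"]["publications"])  # 6
```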

Provided that researchers complete the annual return consistently and accurately, such databases may overcome some of the limitations of one-off, resource-intensive case study approaches. However, the design (and business model) of Researchfish® is such that the only funding streams captured are from organisations prepared to pay the membership fee, thereby potentially distorting the picture of whose input accounts for a research team’s outputs.

Researchfish® collects data both ‘top-down’ (from funders) and ‘bottom-up’ (from individual research teams). A comparable US model is the High Impacts Tracking System, a web-based software tool developed by the National Institute of Environmental Health Sciences; it imports data from existing National Institutes of Health databases of grant information as well as the texts of progress reports and notes of programme managers [53].

Whilst electronic databases are increasingly mainstreamed in national research policy (Researchfish® was used, for example, to populate the Framework on Economic Impacts described by the UK Department for Business, Innovation and Skills [54]), we were unable to identify any published independent evaluations of their use.

Realist evaluation

Realist evaluation, designed to address the question “what works for whom in what circumstances”, rests on the assumption that different research inputs and processes in different contexts may generate different outcomes (column 4 in Table 1) [55]. A new approach, developed to assess and summarise impact in the national evaluation of UK Collaborations for Leadership in Applied Health Research and Care, is shown in Fig. 3 [56]. Whilst considered useful in that evaluation, it was resource-intensive to apply.

Fig. 3 Realist model of research-service links and impacts in CLAHRCs (reproduced under UK non-commercial government licence from [56])

Contribution mapping

Kok and Schuit describe the research ecosystem as a complex and unstable network of people and technologies [57]. They depict the achievement of impact as shifting and stabilising the network’s configuration by mobilising people and resources (including knowledge in material forms, such as guidelines or software) and enrolling them in changing ‘actor scenarios’. In this model, the focus is shifted from attribution to contribution – that is, on the activities and alignment efforts of different actors (linked to the research and, more distantly, unlinked to it) in the three phases of the research process (formulation, production and extension; Fig. 4). Contribution mapping, which can be thought of as a variation on the Dutch approaches to societal impact assessment described above, uses in-depth case study methods but differs from more mainstream approaches in its philosophical and theoretical basis (column 6 in Table 1), in its focus on processes and activities, and in its goal of producing an account of how the network of actors and artefacts shifts and stabilises (or not). Its empirical application to date has been limited.

Fig. 4 Kok and Schuit’s ‘contribution mapping’ model (reproduced under Creative Commons Attribution Licence 4.0 from [57])

The SPIRIT Action Framework

The SPIRIT Action Framework, recently published by Australia’s Sax Institute [58], retains a logic model structure but places more emphasis on engagement and capacity-building activities in organisations and acknowledges the messiness of, and multiple influences on, the policy process (Fig. 5). Unusually, the ‘logic model’ focuses not on the research but on the receiving organisation’s need for research. We understand that it is currently being empirically tested but evaluations have not yet been published.

Fig. 5 The SPIRIT Action Framework (reproduced under Creative Commons Attribution Licence from [58] Fig. 1, p. 151)

Participatory research impact model

Community-based participatory research is predicated on a critical philosophy that emphasises social justice and the value of knowledge in liberating the disadvantaged from oppression (column 5 in Table 1) [59]. Cacari-Stone et al.’s model depicts the complex and contingent relationship between a community-campus partnership and the policymaking process [60]. Research impact is depicted in synergistic terms as progressive strengthening of the partnership and its consequent ability to influence policy decisions. The paper introducing the model includes a detailed account of its application (Table 2), but beyond this, the model has not yet been empirically tested.

Discussion

This review of research impact assessment, which has sought to supplement rather than duplicate more extended overviews [1–7], prompts four main conclusions.

First, one size does not fit all. Different approaches to measuring research impact are designed for different purposes. Logic models can be very useful for tracking the impacts of a funding stream from award to quantified (and perhaps monetised) impacts. However, when exploring less directly attributable aspects of the research-impact link, narrative accounts of how these links emerged and developed are invariably needed.

Second, the perfect is the enemy of the good. Producing detailed and validated case studies, with a full assessment of context and all major claims independently verified, takes work and skill. There is a trade-off between the quality, completeness and timeliness of the data informing an impact assessment, on the one hand, and the cost and feasibility of generating such data on the other. It is no accident that some of the most theoretically elegant approaches to impact assessment have (ironically) had limited influence on the assessment of impact in practice.

Third, warnings from critics that focusing on short-term, proximal impacts (however accurately measured) could create a perverse incentive against more complex and/or politically sensitive research whose impacts are likely to be indirect and hard to measure [61–63] should be taken seriously. However, as the science of how to measure intervening processes and activities advances, it may be possible to use such metrics creatively to support and incentivise the development of complementary assets of various kinds.

Fourth, change is afoot. Driven by both technological advances and the mounting economic pressures on the research community, labour-intensive impact models that require manual assessment of documents, researcher interviews and a bespoke narrative may be overtaken in the future by more automated approaches. The potential for ‘big data’ linkage (for example, supplementing Researchfish® entries with bibliometrics on research citations) may be considerable, though its benefits are currently speculative (and the risks unknown).

Conclusions

As the studies presented in this review illustrate, research on research impact is a rapidly growing interdisciplinary field, spanning evidence-based medicine (via sub-fields such as knowledge translation and implementation science), health services research, economics, informatics, sociology of science and higher education studies. One priority for research in this field is an assessment of how far the newer approaches that rely on regular updating of electronic databases are able to provide the breadth of understanding about the nature of the impacts, and how they arise, that can come from the more established and more ‘manual’ approaches. Future research should also address the topical question of whether research impact tools could be used to help target resources and reduce waste in research (for example, to decide whether to commission a new clinical trial or a meta-analysis of existing trials); we note, for example, the efforts of the UK National Institute for Health Research in this regard [64].

Once methods for assessing research impact have been developed, it is likely that they will be used. As the range of approaches grows, the challenge is to ensure that the most appropriate one is selected for each of the many different circumstances in which (and the different purposes for which) people may seek to measure impact. It is also worth noting that existing empirical studies have been undertaken primarily in high-income countries and relate to health research systems in North America, Europe and Australasia. The extent to which these frameworks are transferable to low- or middle-income countries or to the Asian setting should be explored further.