Introduction

Early detection, through systematic screening programmes, of sight-threatening diabetic retinopathy (STDR) at a stage that allows timely intervention is universally recognised as important in preventing visual impairment [1] and reducing its associated costs, but approaches vary greatly worldwide. Screening has to date been annual, based on consensus, and this remains the recommendation in major guidelines [2, 3]. However, the prevalence of diabetes is rising rapidly [4], increasing the requirement for screening, and resources are stretched.

Extending the interval between screening episodes offers potential cost savings. Some developed countries have recommended or implemented 2 yearly, and sometimes longer, intervals for people at low risk of progression. Much of the evidence supporting extended intervals comes from observational studies in areas with low incidence rates of STDR [5,6,7,8] and from modelling studies [9]. Safety concerns have been highlighted both by a recent systematic review, which called for RCTs and cost-effectiveness evidence [10], and by recent failures in cancer screening [11]. In addition, the feasibility of linking large and disparate datasets is considered challenging [12].

Based on our previous incidence data [5] we designed an RCT (the Individualised Screening for Diabetic Retinopathy [ISDR] study) to investigate the safety, efficacy and cost-effectiveness of extending screening intervals in low-risk individuals with diabetes, with more frequent intervals for those at high risk. We used the emerging methodology and technologies of personalised risk prediction [13, 14] and data linkage to develop an individualised, variable-interval, risk-based screening approach. Individualised clinical care offers opportunities for improved patient engagement. We also wanted to test the feasibility and stability of linking routine data across varying National Health Service (NHS) domains in an integrated approach. We tested the hypothesis of equivalence between attendance rates, as a primary measure of safety, for individualised and annual screening.

Methods

Study design and participants

Individuals with diabetes attending for diabetic retinopathy screening were invited to participate in a single-site, two-arm, parallel-assignment, equivalence RCT conducted in all community screening clinics in the Liverpool Diabetic Eye Screening Programme, which is part of the English National Diabetic Eye Screening Programme. The rationale, design and methodology have been published elsewhere [15], and the protocol and statistical and health economics analysis plans are available online [16]. In brief, the inclusion criteria were: age 12 years or older; attending for retinal screening during the recruitment period; registered with a participating general practitioner; no retinopathy, or retinopathy/maculopathy below the definition of screen-positive diabetic retinopathy; gradable digital retinal images in both eyes; and no opt-out from the data warehouse (see below).

The English National Screening Committee definition of a screen-positive result was used, comprising any of: (1) moderate preproliferative diabetic retinopathy (R2) (equivalent to moderate non-proliferative diabetic retinopathy) or worse (any of: multiple deep blot haemorrhages, venous beading, intraretinal microvascular abnormalities, or worse); (2) new proliferative diabetic retinopathy (R3A); (3) maculopathy (M1) (any of: exudates ≤1 disc diameter (DD) from the foveal centre, group exudates ≥1/2 disc area (DA) ≥1 DD from the foveal centre, haemorrhage ≤1 DD from the foveal centre if visual acuity ≥+0.30 log minimal angle of resolution); (4) ungradable images; or (5) other significant sight-threatening disease [17, 18]. The definition of STDR was met when either retinopathy or maculopathy, as defined above, was confirmed on clinical examination by a retinal specialist (R2, R3A and/or M1 in England).

A patient and public involvement (PPI) group was embedded in all aspects of the study design, delivery and interpretation. The Liverpool Clinical Trials Research Centre developed electronic case report forms, information systems, quality assurance systems and systems to minimise operational bias (see electronic supplementary material [ESM] Methods, Clinical Trials Research Centre procedures). Ethics approval was by Preston NHS Research Ethics Committee (14/NW/0034).

The trial opened on 1 May 2014. Follow-up was for a minimum of 24 months plus a 90 day window in which to attend the screening invitation. Participants were recruited by trained researchers at their screening appointment and all provided written informed consent. For children aged 12–15 years, proxy consent was given by the parent/guardian with, where appropriate, assent from the child. Trial management is described in ESM Methods, Trial management.

Participants were allocated 1:1 to annual screening (control arm, current care) or individualised, risk-based, variable-interval screening with recall at 6, 12 or 24 months for those at high, medium and low risk, respectively. A purpose-built, dynamic data warehouse linking primary and secondary care demographic, retinopathy and systemic risk factor data populated the baseline and follow-up electronic case report forms (OpenClinica, v3.12; OpenClinica, USA). Block randomisation, with sequences generated by an independent statistician, was conducted using a bespoke, validated electronic system at the Clinical Trials Research Centre, with stratification by clinic and age using random blocks of four and six for participants aged ≥16 years, and blocks of two for those aged <16 years to account for small numbers. Screening staff and clinical assessors were masked to the intervention arm, risk calculation and screening interval.
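As an illustration of the allocation mechanism, the sketch below implements simple permuted-block randomisation in Python. It is a minimal, hypothetical reconstruction, not the validated Clinical Trials Research Centre system; stratification is reduced to generating one list per stratum, and all function names are our own.

```python
import random

def block_randomise(n_participants, block_sizes=(4, 6), arms=("annual", "individualised")):
    """Permuted-block 1:1 allocation for one stratum (hypothetical sketch)."""
    allocation = []
    while len(allocation) < n_participants:
        size = random.choice(block_sizes)                   # block length chosen at random
        block = [arms[i % len(arms)] for i in range(size)]  # equal numbers per arm
        random.shuffle(block)                               # permute within the block
        allocation.extend(block)
    return allocation[:n_participants]

# One list per stratum (clinic x age group); blocks of two for under-16s
adults = block_randomise(100, block_sizes=(4, 6))
children = block_randomise(10, block_sizes=(2,))
```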

Procedures

Each participant’s risk of becoming screen-positive was assessed by a risk-calculation engine (RCE) that was specifically developed for the RCT and is described in detail elsewhere [19]. Briefly, the RCE uses data on retinopathy levels and demographic and clinical risk factors from the local population and the individual to estimate the likelihood of progression for that individual over a given time period.

The RCE development dataset comprised 5 years of retinopathy, demographic and clinical data (to 4 February 2014) from 11,806 individuals with diabetes, held in the ISDR data warehouse. Participants and their general practitioners had agreed to data sharing. The RCE is a Markov multi-state model, with states defined by retinopathy level (both eyes) and transitions dependent on risk factors including historical retinopathy data. Candidate risk factors were identified in collaboration with the PPI group and selected as informative using the corrected Akaike's information criterion. The risk factors identified in the development dataset and included in the model were age, time since diagnosis of diabetes, HbA1c, systolic BP and total cholesterol. Prediction periods of 6, 12 and 24 months and a risk threshold of 2.5% were agreed with the PPI group. The RCE showed good discriminatory ability: corrected AUCs for 6, 12 and 24 months were 0.88 (95% CI 0.83, 0.93), 0.90 (95% CI 0.87, 0.93) and 0.91 (95% CI 0.87, 0.94), respectively. Sensitivities and specificities at the 2.5% risk threshold were, respectively, 0.61 and 0.93 for 6 months, 0.67 and 0.90 for 12 months, and 0.82 and 0.81 for 24 months. Using the 2.5% threshold, the corrected C-index for the model was 0.687 [19].
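To make the mechanics concrete, the following sketch shows how a discrete-time Markov chain converts a transition matrix into a probability of becoming screen-positive over 6, 12 or 24 months. The three states and all transition probabilities are invented for illustration only; the actual RCE is a covariate-dependent multi-state model fitted to local screening data [19].

```python
import numpy as np

# Illustrative chain with three states:
# 0 = no retinopathy, 1 = background retinopathy, 2 = screen-positive.
P = np.array([
    [0.970, 0.025, 0.005],   # transitions from no retinopathy per 6 months
    [0.000, 0.950, 0.050],   # transitions from background retinopathy
    [0.000, 0.000, 1.000],   # screen-positive treated as absorbing here
])

def risk_screen_positive(state, months):
    """Probability of reaching the screen-positive state within `months`."""
    steps = months // 6                        # one transition per 6 months
    return np.linalg.matrix_power(P, steps)[state, 2]

for horizon in (6, 12, 24):
    risk = risk_screen_positive(0, horizon)
    print(f"{horizon:>2} months: {risk:.2%}")  # compared against 2.5% in ISDR
```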

At each screening visit during the trial, the RCE calculated a participant’s risk of becoming screen-positive using automatic exchanges of retinopathy data from the screening software (OptoMize v4.3; EMIS Health, UK) and risk factor data held in the data warehouse and randomisation databases. The data warehouse was updated with clinical data from primary care every 2 months. Participants were allocated to a high-, medium- or low-risk group against the 2.5% threshold. The screening interval could change at each follow-up visit. Participants in the control arm continued with invitations to annual screening, with risk recorded for future analysis.
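The resulting allocation rule can be summarised in a few lines. This is our hedged reading of the design (assign the longest interval whose predicted risk remains below the 2.5% threshold); the function and parameter names are hypothetical.

```python
def allocate_interval(risk_by_horizon, threshold=0.025):
    """Return the longest screening interval (months) whose predicted risk of
    becoming screen-positive stays below the threshold.

    `risk_by_horizon` maps horizon in months -> predicted risk, e.g. the
    RCE output for 6, 12 and 24 months.
    """
    for months in (24, 12):                  # prefer the longest safe interval
        if risk_by_horizon[months] < threshold:
            return months                    # 24 = low risk, 12 = medium risk
    return 6                                 # otherwise high risk

print(allocate_interval({6: 0.004, 12: 0.010, 24: 0.031}))   # -> 12 (medium risk)
```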

Participants who were screen-positive attended for slit-lamp biomicroscopy to determine the presence of STDR (true positive). Participants with a false-positive result were reconsented and re-entered the trial. Participants were free to withdraw consent at any time without providing a reason.

Outcomes

The primary outcome, attendance at the first follow-up visit (6, 12 or 24 months), assessed the safety of individualised screening. Non-attendance was defined as failure to attend any appointment within 90 days of the follow-up invitation, irrespective of the number of invitations.

Secondary outcomes measuring safety and efficacy reported here include STDR, visual acuity (recorded as log of the minimum angle of resolution), visual impairment (visual acuity ≥+0.30 and ≥+0.50), screen-positive results and rates of retinopathy treatment over the 24 months (see ESM Methods, Secondary outcomes). Quality-adjusted life-years (QALYs) were used to produce cost-effectiveness estimates.

Statistical analysis

Our primary hypothesis was that attendance rates at first follow-up would be equivalent in the two arms with a 5% equivalence margin. The estimated minimum sample size was 4460 (90% power, 2.5% one-sided type 1 error, assuming the same attendance rate in both arms and allowing for 6% per annum loss over 24 months). Further details, including a sample size review conducted during the recruitment phase, are given in ESM Methods, Sample size. Our secondary hypothesis was that STDR detection was non-inferior in the individualised arm at a prespecified margin of 1.5%.
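For readers who want to reproduce the general shape of such a calculation, the sketch below gives a textbook two-one-sided-tests (TOST) sample size for equivalence of two proportions. The assumed attendance rate (80%) is our illustrative choice, so the output will not exactly match the trial's 4460, which rests on the assumptions specified in ESM Methods, Sample size.

```python
from scipy.stats import norm

def equivalence_n_per_arm(p, margin, alpha=0.025, power=0.90):
    """Per-arm n for equivalence of two proportions via two one-sided tests,
    assuming a true difference of zero (textbook approximation)."""
    z_alpha = norm.ppf(1 - alpha)            # one-sided type 1 error
    z_beta = norm.ppf(1 - (1 - power) / 2)   # type 2 error split across both tests
    return 2 * p * (1 - p) * ((z_alpha + z_beta) / margin) ** 2

n_arm = equivalence_n_per_arm(p=0.80, margin=0.05)   # 80% attendance is an assumption
n_total = 2 * n_arm / 0.94 ** 2                      # inflate for 6%/year loss over 2 years
print(round(n_arm), round(n_total))                  # ~1664 per arm, ~3766 total
```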

Primary equivalence and non-inferiority analyses followed a per-protocol approach supported by secondary intention-to-treat analyses [20]. Adherence to protocol for attendance was assessed at the first follow-up visit, and by 24 months (+90 days) for STDR. Multiple imputations, generated using generalised linear models (GLMs) dependent on baseline characteristics (PROC MI; SAS v9.3; SAS Institute, USA), were used to assess the effect of missing values on both the per-protocol and intention-to-treat datasets.
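As an illustration of model-based multiple imputation (the trial used SAS PROC MI with GLMs on baseline characteristics), the following Python sketch draws several completed datasets with scikit-learn's IterativeImputer. The data and variable set are invented, and this mimics the idea rather than the exact SAS procedure.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Invented stand-in data: baseline covariates with ~10% missing at random
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[rng.random(X.shape) < 0.10] = np.nan

# Draw m = 5 completed datasets; sample_posterior=True gives proper MI draws
imputed_sets = [
    IterativeImputer(sample_posterior=True, random_state=m).fit_transform(X)
    for m in range(5)
]
# Each completed dataset is analysed separately and the estimates are then
# pooled (e.g. with Rubin's rules, as in the health economics analysis [28]).
```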

Within the three risk groups of the individualised arm, equivalence in attendance rates between the two arms and non-inferiority in detection of STDR were explored. Participants in the control arm were allocated to risk groups based on the RCE risks at baseline. GLMs were fitted with arm, level of risk and their interaction added as factors.

Health economics

The costs of routine screening were measured using a mixed micro-costing and observational health economics analysis over a 2 year time horizon (see ESM Methods, Costs). Societal costs, including participant and companion costs, collected using a bespoke questionnaire, comprised time lost from work (productivity losses) and travel and parking costs. A detailed workplace analysis, measuring resources and staff time to deliver the screening programme, was conducted at each screening centre. This ingredient-based, bottom-up approach enabled a current resource-based cost to be attributed to the cost of screening each individual, taking into account both attendees and the related costs of non-attendance. We estimated the additional costs of running the RCE using a screen population size of 22,000 (Liverpool). Treatment costs were excluded because the 2 year time horizon was considered too short to support inferences about lifetime costs (see ESM Methods, Costs).

A sample of the first participants enrolled into the RCT (n = 868) completed the EuroQol Five-Dimension Questionnaire (EQ-5D) five-level version (EQ-5D-5L) [21] and the Health Utilities Index Mark 3 (HUI3) [22] questionnaire at baseline and follow-up visits. Health state utilities were mapped [23] from the EQ-5D-5L to the EQ-5D three-level version and valued using a UK population tariff [24]. In the absence of an English or UK valuation set, we applied a relevant Canadian tariff [25] to the HUI3 health state classifications. Discounting was not applied, as both costs and QALYs were assumed to be assigned and incurred on an annual basis. Further detail is available in ESM Methods, Utilities and quality-adjusted life-years.

A detailed description of the cost-effectiveness analysis is available in ESM Methods, Cost-effectiveness methodology. A 90 day attendance window was utilised, with a further 90 days added at 24 months to allow for the compounding lag in scheduling (see ESM Figs 1 and 2). We conducted multiple imputation by chained equations using available case data and followed guidance for best practice [26]. QALYs were derived using AUC, and incremental effects were estimated through ordinary least squares regression (for the univariate distributions of complete cases) and seemingly unrelated regressions (for the joint distributions of multiply imputed sets) on baseline utilities. We present unadjusted estimates as sensitivity analyses. We bootstrapped these regressions to characterise sampling distributions and derive 95% bias-corrected CIs around trial arm means and mean differences [27]. Intention-to-treat analyses were conducted in Stata/SE (Release 16; StataCorp, USA) from an NHS/societal perspective, and post-multiple imputation analyses followed Rubin's combination rules for estimation within multiply imputed sets [28].
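A simplified sketch of two of these steps follows: QALYs as the area under the utility-time curve, and a bias-corrected bootstrap CI for the between-arm difference. It bootstraps unadjusted arm means rather than the baseline-adjusted OLS/SUR regressions used in the trial [27], and all data are invented.

```python
import numpy as np
from scipy.stats import norm

def qaly_auc(utilities, times_years):
    """QALYs as the area under the utility-time curve (trapezoidal rule)."""
    return np.trapz(utilities, times_years)

def bootstrap_diff(arm_a, arm_b, n_boot=1000, seed=0):
    """Bias-corrected percentile CI for a difference in mean QALYs."""
    rng = np.random.default_rng(seed)
    observed = arm_a.mean() - arm_b.mean()
    boots = np.array([
        rng.choice(arm_a, arm_a.size).mean() - rng.choice(arm_b, arm_b.size).mean()
        for _ in range(n_boot)
    ])
    z0 = norm.ppf((boots < observed).mean())           # bias-correction factor
    levels = norm.cdf(2 * z0 + norm.ppf([0.025, 0.975]))
    return observed, np.quantile(boots, levels)

# Invented data: per-participant 2 year QALYs; e.g. utilities 0.80/0.82/0.78
# at 0, 12 and 24 months give qaly_auc(...) = 1.61
rng = np.random.default_rng(1)
diff, ci = bootstrap_diff(rng.normal(1.61, 0.3, 400), rng.normal(1.60, 0.3, 400))
print(f"mean difference {diff:.3f}, 95% BC CI {ci.round(3)}")
```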

Results

Figure 1 summarises the trial profile showing the numbers for eligibility, allocation and withdrawals, and per-protocol and intention-to-treat datasets for the primary analysis. From 1 May 2014, 4538 participants were enrolled; four withdrew after randomisation, requesting removal of their trial data. Reasons for non-consent are shown in ESM Table 1. Allocations were 2269 to the control arm and 2265 to the individualised-interval arm (198, 211 and 1856 in the high-, medium- and low-risk groups, respectively). Last follow-up was on 5 September 2018.

Fig. 1 CONSORT 2010 flow diagram

The baseline characteristics of participants in the per-protocol dataset are shown in Table 1 (similar distributions for intention-to-treat are shown in ESM Table 2). Participants were aged 14–100 years (median 63 years), 60.4% were male, 94.6% were white and 88.5% had type 2 diabetes. Compared with the other two risk groups, those in the high-risk group were more likely to have type 1 diabetes, had a longer diabetes duration and higher HbA1c, and were less likely to have ever smoked. Proportions with any retinopathy by group within the individualised arm were 99.5%, 79.1% and 3.9% for those allocated to screening at 6, 12 and 24 months, respectively.

Table 1 Participant baseline characteristics by arm and screening interval allocation in 4503 participants in the per-protocol dataset

A total of 182 (4.0%) participants withdrew from the trial before the first follow-up: 25 (0.6%) withdrew consent, 15 (0.3%) prematurely discontinued the intervention and 142 (3.2%) were lost to follow-up (ESM Table 3). Withdrawals of consent were higher in the individualised arm (0.9% vs 0.2%), as was loss to follow-up (101 [4.5%] vs 41 [1.8%]), probably exacerbated by the longer 24 month follow-up period in the low-risk group (81.9% of the individualised arm).

Attendance rates at first follow-up for the control and individualised arms were 84.7% (1883/2224) and 83.6% (1754/2097), respectively (difference in proportions −1.0% [95% CI −3.2, 1.2], per-protocol analysis). Against the predefined equivalence margin (5%), the two arms were regarded as equivalent (Fig. 2, Table 2). Protocol deviations resulting in exclusion from this analysis occurred in 31 participants in the individualised arm, with no effect on safety (one participant was assigned to screening at 12 instead of 6 months and 30 were assigned to 6 or 12 months instead of 24 months). Similar results were obtained from the intention-to-treat analysis. Per-protocol and intention-to-treat analyses with multiple (Table 2) and simple (ESM Table 4) imputation confirmed equivalence in attendance rates between the two arms.

Fig. 2 Difference in the proportion of participants attending the first follow-up visit between the two arms. 'Overall': primary analysis; 'per risk group': high-, medium- and low-risk groups within the individualised arm. Point estimates and 95% confidence limits are shown. Vertical dashed lines indicate the predefined 5% equivalence margin. Diff, difference; ITT, intention to treat; PP, per protocol

Table 2 Results of the test for equivalence in attendance rate at first follow-up visit and of non-inferiority in STDR detection within 24 months, based on the per-protocol, intention-to-treat and multiple imputation datasets

Figure 2 and Table 2 show the equivalence analysis within the individualised arm. Equivalence in attendance rates at the first follow-up visit was found for the low-risk group (control 85.7%, individualised 85.1%, difference −0.6% [95% CI −2.9, 1.7]). For the medium-risk group, the difference in attendance rates was also very small (control 81.7%, individualised 82.2%, difference 0.6% [95% CI −7.3, 8.4]); however, equivalence was not confirmed due to the relatively wide CI. Attendance rates were lower in the high-risk group (control 77.3%, individualised 72.3%, difference −5.0% [95% CI −13.6, 3.5]) and equivalence was not observed. The attendance rates observed over 12 months (≥1 attended appointment), however, were higher in the individualised arm (89.1%) compared with the control arm (77.3%). A post hoc analysis of attendance over 24 months gave similar results (ESM Table 5).

The mean number of appointments per person over 24 months, by baseline risk allocation, was 1.83, 1.06 and 0.85 in the high-, medium- and low-risk groups, respectively. At least one change in allocation from baseline was recorded as follows: in the high-risk group, 48/160 (30.0%) participants were changed to a longer screening interval; in the medium-risk group, 34/200 (17.0%) participants were changed to a shorter interval and 84/200 (42.0%) to a longer interval; in the low-risk group, 142/1694 (8.4%) participants were changed to a shorter interval (ESM Table 6). Overall, 132 participants were switched to a longer screening interval and 176 to a shorter interval.

There was no evidence of a loss of ability to detect STDR over 24 months from baseline in the individualised arm (28/1956; 1.4%) compared with the control arm (35/2042; 1.7%), with a difference of −0.3% (95% CI −1.1, 0.5) (Table 2). Non-inferiority was found within the low-risk group (control 0.6%, individualised 0.2%, difference −0.3% [95% CI −0.9, 0.1]). Non-inferiority was not confirmed for the high- and medium-risk groups, probably because of small participant numbers. Similar results were obtained with intention-to-treat and multiple and simple imputation analyses.

Four participants required treatment within 6 months of being screen-positive: two for STDR (one in the control arm and one in the high-risk group) and two for reasons other than diabetic retinopathy.

Withdrawals, premature discontinuations and loss to follow-up within 24 months showed a similar distribution across the arms (ESM Table 7).

Further safety data are presented in ESM Results, Secondary safety outcomes, and ESM Table 8. We did not detect a clinically significant worsening of diabetes control or an increase in visual impairment when comparing the groups.

As a secondary outcome, we investigated the efficacy of individualised screening using data on numbers of attended appointments and rates of screen-positive events across 24 months (ESM Table 9). Overall, 43.2% fewer screening attendances were required in the individualised arm than in the control arm (2008 vs 3536). Higher rates of screen-positive events per screening episode attended were seen in the individualised arm (individualised 5.1% [102/2008], control 4.5% [160/3536]). Within the individualised arm, the high-risk group had the highest screen-positive rate (high 10.7% [34/317], medium 6.0% [15/249], low 3.7% [53/1442]). In the low-risk group, most of the screen-positive results were due to eye disease other than diabetic retinopathy; the rate of screen-positive results for diabetic retinopathy was low, at 0.5% (7/1442). Screening episodes that detected STDR occurred earlier in the individualised arm than in the control arm: 6–12 months 17.9% (5/28) vs 2.9% (1/35); 12–18 months 32.1% (9/28) vs 60.0% (21/35).

A total of 868 participants completed the health economics questionnaires. ESM Table 10 presents the summary costs (2019/2020 values) associated with the screening programme. The cost to the NHS was £28.73 per attendance and £12.73 per non-attendance, while additional productivity losses and out-of-pocket payments by the patient accounted for £9.00.

Within-trial summary health economic and cost-effectiveness data over the 2 year time horizon are reported in Table 3, with additional data for the two arms in ESM Tables 11 and 12. Multiple imputation supported the strict dominance of individualised screening in terms of QALYs gained and cost savings. Here we briefly summarise the results, reporting conservative estimates from an analysis of complete-case QALYs and multiply imputed costs. Mean incremental QALY scores did not show a statistically significant difference between the trial arms (EQ-5D 0.006 [95% CI −0.039, 0.06], EuroQol Visual Analogue Score [EQ-VAS] 0.004 [95% CI −0.049, 0.052] and HUI3 −0.017 [95% CI −0.083, 0.04]; Table 3), with agreement between societal preferences (EQ-5D/HUI3) and individual preferences (EQ-VAS). Incremental cost savings per participant with individualised screening were £17.34 (95% CI 17.02, 17.67) from the NHS perspective and £23.11 (95% CI 22.73, 23.53) from the societal perspective, corresponding to reductions in total programme costs of 20% (from £193,983 to £154,386) and 21% (from £248,114 to £195,348), respectively. The individualised arm showed incremental savings across all domains. The NHS perspective cost-effectiveness plane for the EQ-5D and HUI3 shows the dominance of the intervention arm in cost savings and expected maintenance of quality of life (Fig. 3). While the intention had been to report cost-effectiveness acceptability curves, the dominant cost reduction of risk-based screening and the small fluctuation in QALYs across all instruments rendered this metric uninformative, as the proportion cost-effective was inelastic to varying thresholds. See ESM Results, Health economics for further details.

Table 3 Within-trial intention-to-treat QALYs and costs per participant: individualised vs annual screening

Fig. 3 Baseline-adjusted EQ-5D, HUI3 and NHS perspective incremental cost-effectiveness of individualised vs annual screening from 1000 bootstrap iterations

Discussion

Our study shows that individualised, risk-based, variable-interval screening appears to be safe. Attendance at the first follow-up visit was equivalent to that for annual screening, and secondary safety findings on the detection of STDR, visual function and glycaemic control are supportive. Our approach reduced the number of appointments by more than 40%. There were no detectable effects on quality-of-life measures, and the cost savings were convincing.

Important strengths of our study are its RCT design, size, independent oversight and direction from an expert PPI group. We used the emerging technology of risk calculation, integrating local clinical data into risk estimation, to introduce a personalised approach, and implemented a mixed Markov RCE in a clinical trial setting. Protocol deviations were few (1.4% for the primary outcome), and the numbers of withdrawals and losses to follow-up, which are inevitable in a large RCT such as this, were moderate. Findings were similar in our per-protocol, intention-to-treat and multiple imputation analyses, which followed current guidance on equivalence studies [20].

Generalisation of our findings has some limitations. Participants were enrolled from a single programme that has been running for more than 30 years, and consequently had relatively low rates of baseline diabetic retinopathy and progression to STDR. Good glycaemic and BP control and a relatively low proportion of participants with type 1 diabetes (4.0%) might have biased the sample. Low rates of diabetic retinopathy have been reported in other similar settings [29, 30], but our results should be treated with caution in areas with a higher prevalence, poorer control of diabetes or wider ethnic group representation, or in programmes during the set-up process.

Our trial had only a 2 year time horizon, which is short in the context of a life-long condition. With a move to extended-interval screening in several countries, we wanted to provide high-quality RCT evidence on how people act when given risk-based, variable-interval screening. Previous studies have derived risk models and used them in validation studies but have not answered this question [31, 32]. Allowing for two cycles in the low-risk group would have extended the study duration from 4 to 6 years. The low rates of retinopathy and STDR in the low-risk group suggest that our findings on effectiveness would be unlikely to change; most disease was detected in the high- and medium-risk groups (STDR by 24 months: high-risk group 37 [12.2%], medium-risk group 12 [3.6%], low-risk group 14 [0.4%]). The cost savings with variable-interval screening were substantial and are unlikely to be lost over a longer time horizon; rather, they should continue to accrue year on year. Robust monitoring, including fail-safe mechanisms such as those in our design, should be part of any future implementation of risk-based, variable-interval screening.

A move to longer screening intervals for people at low risk of diabetic retinopathy has previously been suggested [33,34,35], but without convincing evidence on safety [10]. Overall, 19.0% of people invited to take part in our study explicitly stated that they wished to remain on annual screening or did not want a change of interval (ESM Table 1). Health professionals fear that extending screening intervals may reduce the perceived importance of screening, leading to loss of engagement and worse diabetes care; we did not detect a worsening in glycaemic control. Our findings give substantial reassurance that a 24 month interval for low-risk individuals with diabetes in a setting such as ours is safe. However, for resource-poor or rural settings in low- and middle-income countries, further research is required before longer intervals can be contemplated.

Using a risk-based approach allows personalisation, offering better targeting of high-risk groups and improved patient engagement. Around 15% of participants in this study had at least one change in their risk-based interval; 59% (118/200) of those allocated to a 12 month interval (the current standard) experienced at least one change of interval (ESM Table 6). Aspelund and colleagues developed a similar risk engine for diabetic retinopathy screening using modelling coefficients from the 1990s [36] and externally validated it in a Dutch cohort [31]. The strength of our RCE is that it was populated with local data and can be regularly updated with current data to reflect changing local progression rates.

For our approach to be adopted more widely, assessment in other local and national screening programmes will be needed, along with some systems development from our research setting to an implementation environment. Our parallel social science study demonstrated the acceptability of variable-interval, risk-based screening to individuals with diabetes and health professionals, provided that additional monitoring and fail-safe mechanisms are included [37]. Further evaluation should include the effect of factors such as the unexplained heterogeneity among screening programmes in England in terms of grading outcomes and screening uptake.

We targeted high-risk people using a 6 month screening interval. Attendance rates in this group were lower in the individualised arm (72.3%) compared with the control arm (77.3%), but the shorter interval allowed more frequent screening and earlier detection of disease. There were higher relative rates of STDR detection in the high- and medium-risk individualised groups (13.4% and 3.9%) compared with 1.7% in the control arm, with very low rates in the low-risk group (0.2%). Our study was powered for equivalence with all risk groups combined and not for the risk group comparisons. Despite the hypothesis of equivalence not being supported in the high-risk group, the attendance rates observed over a period of 12 months were considerably higher in the individualised arm compared with the control arm (89.1% vs 77.3%).

Including systemic clinical risk factors (HbA1c, systolic BP, lipids) adds value in several ways. It allows the introduction of a high-risk group with earlier detection of STDR. It also improves patient engagement by linking retinopathy to systemic control; our PPI group strongly advised that including clinical data reinforces the message that control is crucial in managing complications. Including systemic risk factors also improves the accuracy of identifying low-risk patients when compared with simpler stratification strategies as suggested for the UK [9], while maintaining a desirable level of sensitivity. A post hoc analysis of our RCT dataset estimated that a simpler stratification approach [9] would allocate 66.9% of participants at baseline to a 24 month screening interval (compared with 81.9% in ISDR) and 33.1% to 12 months (data not shown). We have observed that multivariate risk models tend to require a lower frequency of eye examinations, and consequently are likely to be more cost-effective than current care [38].

Adding systemic risk factors may not be feasible in many settings where reliably linking primary and secondary care data has proved difficult because of issues with data ownership and IT system management. We overcame this through strong support from local health commissioners and primary-care research groups, and needed to develop bespoke data processing, imputation and data validation processes; these costs were included in our cost-effectiveness analysis.

Our data show convincing evidence that an individualised approach provides considerable cost savings compared with annual screening. In addition, moving to variable-interval, risk-based screening did not compromise participants’ quality of life. Incremental screening cost savings of £17.34 were achieved (NHS perspective), rising to £23.11 (societal perspective) per participant over the 2 years, a reduction in total programme costs of 20% and 21%, respectively. A key driver in achieving cost-effectiveness was the reduction in unnecessary appointments and efficient use of administrative time. In a screening population such as in Liverpool (22,909 invitations in 2018–2019), this may amount to annual savings in the region of £199,000. In England (screening population 2.76 million [2018–2019] [39]), this could amount to around £23.9 million in annual savings for the NHS, rising to £31.9 million from a societal perspective. Such resources could be used to target groups that are hard to reach and those at high risk of visual impairment, and to more cost-efficiently screen the expanding population of individuals with diabetes. Furthermore, those in low-risk groups would be spared the inconvenience and additional personal cost of attending superfluous appointments.
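As a back-of-envelope check on these projections, the sketch below recomputes them from the figures reported above, assuming annual savings are simply half the 2 year per-participant saving.

```python
# Figures taken from the results reported in this paper
saving_2yr_nhs = 17.34        # GBP per participant over 2 years, NHS perspective
saving_2yr_societal = 23.11   # GBP per participant over 2 years, societal

liverpool_invitations = 22_909     # 2018-2019
england_population = 2_760_000     # screening population 2018-2019 [39]

print(f"Liverpool, NHS: ~GBP {liverpool_invitations * saving_2yr_nhs / 2:,.0f}/year")
print(f"England, NHS: ~GBP {england_population * saving_2yr_nhs / 2 / 1e6:.1f}m/year")
print(f"England, societal: ~GBP {england_population * saving_2yr_societal / 2 / 1e6:.1f}m/year")
# -> roughly 199,000; 23.9m; 31.9m, matching the figures quoted above
```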

The large number of observations and the accuracy of the true resource cost of screening are strengths of our cost-effectiveness analysis. The work could have been further strengthened by taking a long-term time horizon as discussed above and including the costs of treatment and blindness averted. Collecting quality-of-life data from every participant in the study would also have strengthened the analysis; our sample size was chosen to minimise disruption in the screening clinic. While methods of multiple imputation involve varied assumptions, the agreement between our complete case and multiply imputed quality-of-life data, across instruments and adjustments, is encouraging in viewing individualised screening as a cost minimiser (see supporting discussion in ESM Discussion).

Our data on efficacy show a higher efficiency of the individualised approach, with a greater proportion of screening episodes being positive (5.1% vs 4.5% in the control arm). Benefits include a lower burden of appointments, earlier detection of STDR in people at high risk and increased capacity to see individuals newly diagnosed with diabetes. Furthermore, in the era of personalised care, a shortened screening interval for people at high risk might increase the focus on risk factor control and engagement with screening.

For people identified by our RCE as being at low risk, the rate of screen-positive results for diabetic retinopathy was very low, at 0.5%, and the rate for STDR even lower, at under 0.2%. This is likely to apply elsewhere, but should not be assumed in territories without an established systematic screening programme, where the first-pass prevalence will be high. The design of our study and the concerns around safety restricted us to a maximum screening interval of 24 months. However, our data suggest that extending intervals beyond 2 years would be reasonable.

In conclusion, our study, the largest RCT performed to date in ophthalmology or screening, should reassure all stakeholders in diabetes care that extended and personalised interval screening can be safely and effectively introduced in established systematic screening programmes. Our evidence covers a 2 year time horizon, so for implementation the long time frame of the disease should be addressed by continuous monitoring of attendance, retinopathy rates and grading quality. The approach is also applicable to other settings where clinical data are available, such as healthcare-delivery organisations and polyclinics. Where current recommendations are for annual screening, we provide evidence to support a move to variable intervals with substantial reductions in cost.