Background

It is a paradox that although most people are willing to participate in medical research, many research studies struggle to recruit participants [1]. Poor recruitment often lengthens studies, increases the cost of research and, at worst, results in underpowered, inconclusive or abandoned studies [2]. One study in 2013 revealed that in the UK only 55% of studies achieved their target sample size and 45% required funding for a time extension [3]. More recently, Tudur Smith et al. conducted an online Delphi survey of 48 clinical trials units (CTUs) and identified ‘Research into methods to boost recruitment in trials’ as the top priority in trial methodological research [4]. Although the challenge of participant recruitment is well recognized and frequently stressed, strategies to deal with it are currently insufficiently evidence based. According to Treweek et al., only two approaches showed slight to moderate recruitment improvement with high-certainty evidence [5]: adopting an open rather than a blinded placebo design, and using a telephone reminder in addition to postal contact. Some researchers have turned to disease-specific registers, such as cancer and diabetes registers, to facilitate trial recruitment, with good results [6,7,8,9]. As a more general approach, research register initiatives based on Electronic Health Records (EHRs) have been developed in different countries, with patient consent, e.g., ResearchMatch [10], the All of Us Research Program [11], Discover-Now [12] and SHARE [13], and more recently without consent [14]. They provide a recruitment support service by helping researchers identify eligible participants in the EHRs and recruit them into their studies.

SHARE, part of National Health Service Research Scotland, is an example of an EHR-based recruitment support service. It seeks consent from people across Scotland for their EHRs to be used to identify eligible participants for health care research [13, 15, 16]. The database was designed following consultation with patients and members of the general public [17]. Registrants are predominantly approached by SHARE in hospital waiting areas. As a result, the proportion of patients with long-term health conditions is higher among SHARE members than in the general population, as corroborated by data from the Scottish Burden of Disease Study 2016 [18] and the 2016 mid-year population estimate [19].

When a research team approaches SHARE for assistance with participant recruitment, the SHARE Studies Access Committee establishes that the study has ethical approval and that the number of participants requested is likely to be achievable. The Health Informatics Centre (HIC) then works with the researchers to create an optimal search algorithm to be implemented in the EHR database. HIC is a National Health Service Safe Haven where EHRs are hosted on a local server within a restricted, safe IT environment. Safe Havens provide a platform for the use of NHS electronic data in research feasibility, delivery and pharmacovigilance [20, 21]. Currently the data most used are demographics, hospital admissions (SMR01), community drug prescriptions, laboratory test results and the cancer register. The cohort derived from the database search is transferred to the SHARE tracker system, which contains contact information and a record of previous contacts and responses. The SHARE team contacts the potentially eligible patients by phone, email or letter, provides information about the study and confirms their interest in participation. When they register with SHARE, registrants can set a contact cap, which is the maximum number of studies they would like to be contacted about each year. Following contact, if potential candidates express an interest in the project then, with permission, their details are passed on to researchers for further eligibility screening and enrolment. Figure 1 shows the workflow of engaging with SHARE for participant recruitment.

Fig. 1 Workflow of recruiting through SHARE [13]

Discover-Now [12], ResearchMatch [10, 22] and All of Us [11, 23] are all similar recruitment support services. SHARE and Discover-Now are based in Scotland and England respectively, while ResearchMatch and All of Us are national registers in the USA. ResearchMatch is reported to have supported more than 700 studies [10, 22]. So far, however, no study has systematically explored the performance of these EHR-based participant recruitment methods.

The large SHARE registrant base has made it attractive to researchers. However, although SHARE had been used for recruitment to roughly 100 health research studies by 2020, it is still uncertain how successful a research infrastructure like SHARE is in meeting recruitment targets and in improving recruitment efficiency by reducing the number of people that researchers need to screen for different kinds of studies. Such information will help researchers decide whether an EHR-based recruitment approach is likely to be successful for their type of study. It can also inform SHARE and other research support groups of the strengths and weaknesses of the approach.

The current paper evaluates SHARE’s performance in participant recruitment for clinical studies and discusses how SHARE can improve in the future.

Methods

Registrant composition

The statistics on registrants’ characteristics were taken from SHARE’s regular reports on age, gender, disease status and medication. Age and gender were recorded in the demographic dataset. Disease status was estimated from the hospital admissions data and medication from the community prescription data. The distribution of the characteristics was described with both raw numbers and percentages.

Recruitment performance: data from SHARE

The SHARE team provided recruitment data for the included projects from their tracking records. To be included, a project had to be either a clinical trial or an observational study and to have completed the recruitment process in SHARE before 2020. Projects were excluded from recruitment data analysis when SHARE could not provide recruitment data because records were missing or incomplete. Records were missing for projects that were cancelled by researchers for personal or study-specific reasons. Records were incomplete for projects that did not progress because the researchers were not satisfied with the initial search result, and for projects in which SHARE was only asked to send out email advertisements and was not further involved in recruitment. For each included project, we present the total number of participants requested by researchers, the number of potentially eligible participants identified through searching EHRs, the number of potentially eligible participants provided by SHARE to researchers and the number finally recruited through SHARE. Recruitment performance was indicated by three metrics: the number of patients recruited through SHARE as a percentage of the total number requested (percentage fulfilled), the number recruited as a percentage of the number provided (percentage provided and recruited), and the number recruited as a percentage of the number identified (percentage identified and recruited).
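
The three metrics defined above can be sketched in a few lines of Python. This is an illustration only, not the code used in the study (the analysis was carried out in R). The class name, field names and function names are hypothetical, and the counts in the example are made up rather than taken from any SHARE project.

```python
from dataclasses import dataclass


@dataclass
class ProjectCounts:
    """Recruitment counts for one project (field names are illustrative)."""
    requested: int   # participants requested by the researchers
    identified: int  # potentially eligible registrants found by the EHR search
    provided: int    # candidates passed on to the researchers
    recruited: int   # participants finally recruited through SHARE


def pct(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a percentage, or NaN if undefined."""
    return 100.0 * numerator / denominator if denominator else float("nan")


def recruitment_metrics(p: ProjectCounts) -> dict:
    """Compute the three recruitment performance metrics for one project."""
    return {
        "percentage_fulfilled": pct(p.recruited, p.requested),
        "percentage_provided_and_recruited": pct(p.recruited, p.provided),
        "percentage_identified_and_recruited": pct(p.recruited, p.identified),
    }


# Made-up counts for illustration only; not taken from any SHARE project.
example = ProjectCounts(requested=100, identified=250, provided=120, recruited=30)
metrics = recruitment_metrics(example)
```

All three metrics share the same numerator (the number recruited through SHARE) and differ only in the denominator, which is why the percentage identified and recruited can never exceed the percentage provided and recruited when the provided candidates are a subset of those identified.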

Recruitment performance: survey of researchers

Since SHARE is often not the only source of recruitment, the SHARE data reflect only the recruitment requests and outcomes recorded by SHARE. We therefore surveyed SHARE users to investigate their study recruitment outcomes and their experiences of recruiting through SHARE. An online survey of the researchers of all eligible projects (trials and observational studies that finished before 2020) was conducted on Qualtrics [24] during March and April 2020. An email with a link to the online survey was sent to each researcher separately, along with a participant information sheet. The researchers were asked how SHARE was involved in their studies, how they would rate SHARE for recruitment and what suggestions they had for improvement. They were also asked to report the target participant number of their study, the number of participants they received from SHARE, and the numbers recruited through SHARE and through other recruitment methods. To make researchers less likely to decline the survey because of concerns about anonymity, all survey questions were optional, and respondents could choose not to answer any question that they thought might make them identifiable. The questionnaire and the participant information sheet are attached as Additional file 1. For each study for which we received a response, we present the raw numbers the researchers reported and calculated the number recruited through SHARE as a percentage of the total recruitment target (percentage fulfilled) and as a percentage of the number received from SHARE (percentage provided and recruited).

Analysis

The centre and variation of recruitment performance across studies were summarised by the median and interquartile range. All calculations were done in the statistical software R (version 3.6.1) [25].
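
As a rough illustration of this summary step, the Python sketch below computes a median and interquartile range for a list of per-study percentages. It is not the authors' R code. The function name and the input values are hypothetical, and the choice of the "inclusive" quantile method is an assumption: it matches the default (type 7) of R's `quantile()`, but the paper does not state which quantile definition was used.

```python
import statistics


def median_iqr(values):
    """Return (median, (lower quartile, upper quartile)) for a list of numbers.

    Assumption: quartiles use linear interpolation over the observed range
    (the "inclusive" method), which corresponds to R's default quantile type 7.
    """
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), (q1, q3)


# Made-up per-study percentages for illustration only.
rates = [10.0, 20.0, 30.0, 40.0, 50.0]
centre, (lower, upper) = median_iqr(rates)
```

With small numbers of studies, different quantile definitions can give noticeably different quartiles, so the interpolation method is worth fixing explicitly when reproducing such summaries.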

Results

As of August 2020, 283,791 patients had registered with SHARE. The distributions of their age, gender, disease status and medication type are shown in Table 1. Overall, 6.4% of the population over 16 in Scotland had registered with SHARE. Figure 2 shows the percentage of SHARE registrants in the 16+ population in each health board across Scotland.

Table 1 Distribution of characteristics of the SHARE registrants
Fig. 2 The proportion of SHARE registrants in the 16+ population in each health board across Scotland

Figure 3 is the flow diagram of project inclusion and exclusion for recruitment data analysis. Forty-four projects for trials or observational studies finished before 2020 and were eligible for inclusion. Of these, 17 were excluded from recruitment data analysis because records were missing: the projects had been cancelled by researchers for personal or study-specific reasons. Two projects were excluded because they did not progress after the initial search for eligible patients in the EHRs. One project was excluded because SHARE was not involved in direct recruitment of patients, so recruitment data were incomplete. The remaining 24 projects were included in the recruitment data analysis.

Fig. 3 Diagram of including and excluding projects for recruitment data analysis

Table 2 compares some characteristics of included and excluded projects. Overall, excluded projects are broadly similar to included projects in terms of time, study type, recruiting region and research domain.

Table 2 Comparison of characteristics between included and excluded SHARE projects

The 24 included projects recruited study subjects in various regions across Scotland. The largest group (11 projects) recruited participants in the East, and 2 recruited all over Scotland. Tables 3 and 4 provide details for each project, by study type, in terms of the region covered, the study inclusion and exclusion criteria, the number of participants requested, the number of potentially eligible individuals found through searching registrants’ EHRs, the number of people provided to researchers, the number recruited into the study and the three calculated metrics of recruitment performance. The overall trend is also presented as the median and interquartile range for trials and observational studies separately.

Table 3 Study description, recruitment outcomes and three derived recruitment performance metrics for 16 clinical trials
Table 4 Study description, recruitment outcomes and three derived recruitment performance metrics for 8 observational studies

Overall, a median of 34.2% (interquartile range 13.3–45.1%) of the number of participants requested by researchers was fulfilled through SHARE. Further analysis of the cohort provided to researchers showed that the median recruitment rate was 29.3% (interquartile range 20.6–52.4%). The number of potential candidates identified by searching EHRs ranged from 94 to 509 across studies, and on average there were 237 people to be further screened for eligibility for a study.

We received 12/44 responses to the online survey (response rate 27%). Two researchers chose not to reveal their study names, and the names reported for three further studies could not be matched with those recorded at SHARE. Of the remaining seven identifiable studies, four were trials and three were observational studies.

According to the survey results, two studies used SHARE as the principal means of participant recruitment. Six reported that they adopted SHARE as a supplementary recruitment method in addition to a variety of approaches such as other patient registers, direct clinical contact, mass media and social media. The other four studies turned to SHARE after their original recruitment strategy failed. SHARE was involved within the first three months of participant recruitment for five studies and during months 3–6 for another two studies. In the other five studies, SHARE was used in the late phase of recruitment, including two that engaged SHARE more than one year into the study.

The participant targets and the recruitment outcomes of different methods reported by researchers, together with the study information for identifiable studies, are presented in Table 5, along with the recruitment performance calculated from the outcomes reported by researchers. The median number recruited through SHARE as a percentage of the study target number was 31.7% (interquartile range 5.8–59.6%). The median recruitment rate among the candidates received was 20.2% (interquartile range 8.2–31.0%). Notably, the researcher of study Ω responded “impossible via other means” when asked about the difficulty encountered during participant recruitment, while SHARE managed to find all the participants for it.

Table 5 Study description, recruitment outcomes reported by researchers and two derived performance metrics

Seven researchers (58.3%) agreed that the participants transferred to them by SHARE were either more eligible than or about as eligible as those found through other methods. Eight respondents (66.7%) considered SHARE a quicker way to recruit participants than other means of recruitment. However, 58% of those surveyed gave less favourable responses regarding the cost of recruiting through SHARE.

When asked for suggestions for improvement, the most frequently raised issue (mentioned by 3 respondents) was the need to improve the eligibility of the candidates transferred. Researchers hoped to receive as few ineligible candidates as possible so that they could save resources on filtering them out. In addition, one researcher was very dissatisfied with the cost charged by SHARE. One researcher requested more rapid responses from the SHARE team. Another mentioned improving the ability to identify mild cognitive impairment. One respondent encouraged SHARE to continue enlarging the registrant base.

The raw data obtained from the survey are attached as Additional file 2.

Discussion

Based on the recruitment data held by SHARE, the register can provide about one third of the number of participants requested by study teams on average. Generally, one eligible participant is recruited for every 3 to 4 candidates provided to researchers by SHARE. The recruitment performance derived from the survey is a little poorer than that from the SHARE data (31.7% and 20.2% for target fulfilment and recruitment among candidates provided, respectively) and shows more variation. The percentage fulfilled reported by researchers may seem lower because it was calculated against the total target number of a study rather than the number requested from SHARE. It reflects how much SHARE contributed to the recruitment outcome of a study, but it might be an underestimate of SHARE's capacity: researchers who had already recruited most of the participants they needed through other methods would not have requested as many from SHARE. Overall, most researchers consider SHARE a quicker way to recruit participants and consider the candidates provided by SHARE at least as likely to be eligible for recruitment as those found through other methods.

SHARE has shown relatively good performance compared with other recruitment methods for most studies reported in the survey. On average, researchers need to screen about 5 candidates from SHARE for each successful recruit, and 75% screen fewer than about 12. This recruitment performance is encouraging compared with that reported by other studies. For example, a cardiovascular disease trial reported screening 5 candidates per participant recruited when offering £100 upon successful recruitment [26]. In some other studies, the number of candidates screened ranged between 8 and 26 with one or more recruitment strategies such as a small financial incentive, church involvement, telephone reminders or text messages [27,28,29,30].
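
The screening figures quoted here are consistent with taking the reciprocal of the recruitment rates reported in the Results. The short Python sketch below shows that arithmetic. It is an illustration, not the authors' calculation, and the function name is hypothetical. The rates used (20.2% and 8.2% from the survey, 29.3% from the SHARE records) are the ones reported in this paper.

```python
def candidates_per_recruit(recruitment_rate_pct: float) -> float:
    """Candidates that must be screened, on average, for one successful recruit."""
    if recruitment_rate_pct <= 0:
        raise ValueError("recruitment rate must be a positive percentage")
    return 100.0 / recruitment_rate_pct


# Survey data: median recruitment rate and lower quartile among candidates received.
survey_median = candidates_per_recruit(20.2)          # roughly 5 candidates
survey_lower_quartile = candidates_per_recruit(8.2)   # roughly 12 candidates

# SHARE records: median recruitment rate among candidates provided.
share_median = candidates_per_recruit(29.3)           # between 3 and 4 candidates
```

Because the number screened is the reciprocal of the rate, the lower quartile of the recruitment rate corresponds to the upper quartile of the number screened, which is why about 75% of studies screen fewer than 12 candidates per recruit.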

For several studies, SHARE seems to have performed more poorly than other recruitment methods, i.e., study V, study B, study I and study Φ. However, these studies reported involving SHARE at a late stage of recruitment, and all but study V stated that other recruitment strategies were failing, which to some extent indicates how difficult recruitment was for these studies. Given that SHARE was approached later in the recruitment phase, it is understandable that it contributed fewer participants than other methods. In contrast, according to the data held by SHARE, about 87% of the number of participants requested from SHARE for study V was fulfilled, which is quite good. Nevertheless, studies with highly restrictive inclusion and exclusion criteria pose a challenge when those criteria cannot be mapped to structured EHR data. SHARE mainly uses hospital admissions data to identify disease status, but some conditions are either not coded or incompletely coded in the EHRs, such as menstrual bleeding (required by study B) and mild cognitive impairment (required by study Φ). The relatively poor performance in study I is probably because most hypertensive patients captured in secondary care data have comorbidities that made them ineligible under the study's exclusion criteria, whereas patients with mild hypertension are mostly seen by their general practitioners (GPs). SHARE was therefore unable to identify these patients from secondary care data and could not find as many eligible patients as the researchers wanted for their study.

To solve these problems, it is important that SHARE has access to EHRs generated from as many of patients’ health care encounters as possible, including diagnoses made in primary care, in-hospital medication records and discharge notes. This would also help to address researchers’ requests for more quality control on the eligibility of the candidates provided. Currently, whether the cohort built from EHRs meets researchers’ criteria depends heavily on what kind of condition is under consideration, whether it is coded in hospital admissions data and how accurately it is recorded. Data currently in use are mainly coded according to the International Classification of Diseases (ICD). ICD codes are widely criticized for their lack of granularity or aetiological information and for poor accuracy in identifying certain diseases [31]. A few studies have shown that the accuracy of pre-screening can be significantly increased, and workload reduced, by using both clinical narrative data and coded data to search for potentially eligible candidates [31,32,33,34]. Incorporating more data sources is likely to overcome many of these issues and help SHARE improve. According to the SHARE team, primary care data are now in the process of being added to the data inventory.

One limitation of this study is that the major source of the recruitment data was the SHARE management team and not all studies were analysed for recruitment performance, so there may be selection bias towards studies with relatively good results. To address this, we additionally conducted a survey of the research teams of all eligible projects. Recruitment data collected from the survey complemented the evaluation of recruitment performance based on SHARE records. Another limitation is the seemingly low survey response rate, though a response rate of 20–30% is considered acceptable for an online survey according to SurveyMonkey [35]. The survey was designed to let researchers express their honest views without worrying about being identified or about their future studies being affected by negative feedback. Although the response rate was not high, the survey was designed and conducted rigorously. We believe that we have gathered a range of researchers’ views on the effect of the service on their studies’ recruitment performance and have drawn valuable information from them. Other survey methods might have resulted in even lower response rates.

Conclusions

In conclusion, SHARE may be a valuable resource for recruiting participants for some clinical studies. Currently it has blind spots in identifying some symptoms and minor ailments because it lacks access to the data reflecting those health care encounters. With the continuing growth in the number of registrants and the potential easing of restrictions on data access, SHARE can become an even more powerful tool for efficient and effective participant recruitment to health research.