Background

Systematic reviews (SRs) are thorough reviews of the literature on a clearly outlined research question and are considered one of the most reliable types of studies in evidence-based medicine. Investigators are advised to search multiple academic databases and reference lists when conducting SRs [1, 2]. When conducting SRs of randomized controlled trials (RCTs), the Cochrane Handbook recommends searching at least Cochrane Central, MEDLINE and Embase, as well as applying the MEDLINE search strategy, which should include a) a term for the health condition of interest, b) the intervention for evaluation, and c) the study design [2]. For qualitative evidence synthesis, Cochrane suggests using purposive sampling instead of the exhaustive approaches used for quantitative research and recommends placing extra emphasis on searching grey literature and local databases [2].

An alternative to the traditional academic literature databases is academic search engines such as Google Scholar and Microsoft Academic Search. These search engines are free of charge and “crawl” the internet for relevant academic literature rather than searching peer-reviewed published literature within a database. As a result, academic search engines can also find grey literature (documents not published by commercial publishers), such as academic theses and organization reports, reducing possible publication bias in a SR [3].

The process of searching multiple academic databases and search engines can be tedious, as each has its own interface and requires separate search strings. For example, Boolean operators, phrase searching, truncation and use of parentheses can all differ between databases [4]. Therefore, to improve search quality, an information specialist’s involvement is generally recommended [2, 5, 6]. Investigators are naturally very interested in how many databases are necessary to achieve a suitable number of references when conducting a SR.
However, it is equally important to know which databases will give the broadest search results and the highest likelihood of unique references, i.e., references not found elsewhere within a given field. These questions have been investigated earlier in qualitative research in general terms [7] and within the field of depression [8], as well as in quantitative research [9,10,11,12,13,14,15], with one study exploring diabetes mellitus [16]. However, no previous studies have investigated SRs of qualitative research regarding diabetes mellitus.

Diabetes mellitus is one of the most frequent chronic diseases of the twenty-first century, with a global prevalence estimated at 463 million people (9.3%) in 2019 and an estimated increase to 700 million (10.9%) by 2045 [17]. It is a disease that demands rigorous and comprehensive care, as patients need to manage diet, exercise, medication and health check-ups with podiatrists, ophthalmologists and general practitioners or endocrinologists. At the same time, patients do not necessarily sense the symptoms of the disease. Compliance is therefore a substantial problem for this patient group [18], resulting in a high occurrence of complications, and it is essential to understand the barriers to patients’ compliance. Unlike quantitative studies, qualitative studies offer an opportunity to understand the clinicians’, caregivers’, relatives’ and, most importantly, the patients’ points of view. In the field of diabetes mellitus, qualitative studies give insight into measures successful in maintaining compliance and their impact on the lives of patients.

This study aimed to investigate which combination of academic literature databases and academic search engines would result in the highest recall rate of references when conducting SRs of qualitative research regarding diabetes mellitus.
Furthermore, we aimed to investigate the current use of academic literature databases and search engines (hereafter jointly referred to as databases), information specialists and additional data gathering methods.

Methods

Inclusion and exclusion of SRs

SRs of qualitative research regarding diabetes mellitus were retrieved from the PubMed database for all entries before the day of inclusion (January 25, 2021). The search terms (“Qualitative Research” [MeSH]) and (“Diabetes Mellitus” [MeSH]) were combined using the Boolean operator “AND”, and the filter “Systematic reviews” was applied. Despite no language restrictions being applied, the search yielded only English results. Likewise, no restriction on the year of publication was applied. The full text of each SR was systematically evaluated against the inclusion and exclusion criteria. Inclusion criteria were SRs of either qualitative or mixed methods (both qualitative and quantitative) research regarding all subtypes of diabetes mellitus. Exclusion criteria were a) lack of a full list of the databases searched for data collection in the SR, b) included references not extractable through the reference list or supplementary data, c) SRs focusing solely on diseases other than diabetes mellitus or d) SRs solely quantitative in nature. Details of the variables collected from each SR and its references are summarized in Table 1.

Table 1 Collected variables for the included SRs and references

Inclusion and exclusion of references from SRs

A list of all included references was extracted from each SR. Each reference was evaluated against the inclusion and exclusion criteria. The inclusion criterion was references included in one of the included SRs. Exclusion criteria were a) quantitative references included in mixed methods SRs, b) references concerning diseases other than diabetes mellitus included in SRs of multiple diseases and c) unpublished references. Figure 1 illustrates the inclusion process of the SRs and their references. All references were systematically hand searched in seven databases: CINAHL, MEDLINE/PubMed, PsycINFO, Embase, Web of Science, Scopus, and Google Scholar. The Social Science Citation Index (SSCI) was not investigated in this study, as it is one of six databases already included in Web of Science. MEDLINE and PubMed were treated as one database because PubMed includes all MEDLINE references [21]. The references were initially searched by title. If the title search did not retrieve a reference, further searches were completed, initially using the basic search functions and later using keywords, authors, and journals. For each reference, it was documented whether the reference was found and in which of the databases.

Statistical analyses

The number and frequency of databases searched were described as absolute numbers, mean, median, and interquartile range (IQR). The correlation between the number of databases searched and the year of publication was assessed using Poisson regression. Searches of reference lists were described as absolute numbers as well as median and range. The contribution of references from each individual database, and the combined contribution of the various database combinations, were calculated as absolute numbers and as recall or combined recall. Recall rates were calculated as the number of included references retrieved by the database or database combination divided by the total number of included references, expressed as a percentage. All calculations of recall rates and the number of unique references per database were performed first with all seven databases and second with Google Scholar excluded, since Google Scholar’s precision in structured literature searches has been reported to be low despite high recall rates (a topic further addressed in the Discussion section) [22, 23]. All statistical analyses were performed using R (v. 4.0.2) in RStudio (v. 1.3.1093) for Windows.
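The recall-rate calculation described above amounts to simple set arithmetic. A minimal sketch in Python (the study's analyses were performed in R; the database names and coverage sets below are hypothetical, for illustration only):

```python
# Recall rate = references retrieved by a database (or combination)
# divided by all included references, expressed as a percentage.
# Coverage sets below are hypothetical, not the study's data.

included = {f"ref{i}" for i in range(1, 11)}  # 10 included references

coverage = {
    "DatabaseA": {"ref1", "ref2", "ref3", "ref4", "ref5", "ref6", "ref7", "ref8"},
    "DatabaseB": {"ref3", "ref4", "ref5", "ref6", "ref7", "ref8", "ref9"},
}

def recall(found: set, total: set) -> float:
    """Percentage of the references in `total` that appear in `found`."""
    return 100 * len(found & total) / len(total)

for name, refs in coverage.items():
    print(f"{name}: {recall(refs, included):.0f}%")

# Combined recall of two databases uses the union of their coverage.
combined = coverage["DatabaseA"] | coverage["DatabaseB"]
print(f"Combined: {recall(combined, included):.0f}%")
```

Because combined recall is computed on the union, adding a database can never decrease recall; it can only leave it unchanged or raise it.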

Results

Inclusion and exclusion of SRs and references

The initial search of PubMed, with the search syntax defined in the Methods section, yielded 35 SRs. Nine SRs were excluded, a process detailed in Fig. 1. A total of 26 SRs met the inclusion criteria and were included in the study. All included SRs were published between 2010 and 2020. No correlation was found between the year of publication and the number of databases searched. See Appendix 1 for an overview of the included SRs. The 26 SRs contained a total of 707 references. Five references could not be extracted, as two SRs included 85 references in total but listed only 80 references in their reference lists. Two hundred and one references were excluded (see Fig. 1 for further details), and 501 unique qualitative studies concerning diabetes mellitus were included. A median (IQR) of 12.5 (6 to 24) references were included per SR.

Fig. 1
figure 1

Flow diagram of the data collection process. 1 References included from more than one SR. 2 Two SRs [19, 20] included 85 references in total, but listed only 80 references in the reference lists. COPD = Chronic Obstructive Pulmonary Disorders. HIV = Human Immunodeficiency Virus

Databases and their frequency of use

The mean and median number of databases searched by the SRs were five and four, respectively, with a range from two to nine databases (Fig. 2).

Fig. 2
figure 2

Number of databases searched by SRs of qualitative research regarding diabetes mellitus

The 26 SRs searched 28 different databases, of which 12 were reported more than once. MEDLINE/PubMed was the most frequently searched database, applied by all SRs (100%), followed by CINAHL, which was searched by 21 of the 26 SRs (81%). Embase and PsycINFO were the third and fourth most frequently searched databases, both searched by 12 SRs (46%) (Fig. 3).

Fig. 3
figure 3

Frequency of database use by the included 26 SRs

The use of information specialists and additional sources

Only one SR (4%) [24] involved an information specialist when choosing databases. Another SR [25] used a search filter developed by an information specialist, while the remaining 24 SRs did not mention using an information specialist. Eighteen of the 26 SRs (69%) searched the reference lists of included articles, although two of the 18 did not present the results of these searches. For the 16 SRs that reported results, this yielded a median (IQR) of 2.5 (one to six) additional references per SR beyond database searches alone; in total, these 16 SRs included 48 references from searching the reference lists of included articles. These 48 references are included in the total of 501 references. Three SRs exclusively searched databases, while the remaining SRs also hand searched journals, key authors, and other sources.

Unique references per database

The seven databases MEDLINE/PubMed, CINAHL, Embase, PsycINFO, Web of Science, Scopus, and Google Scholar were investigated individually. A total of 9 references (1.8%) were unique to a single one of these seven databases. Table 2 shows the number of unique references for each database. Embase retrieved the highest number of unique references, followed by CINAHL, Google Scholar and MEDLINE/PubMed. The databases were also investigated with Google Scholar excluded; in this case, CINAHL retrieved the highest number of unique references (n = 15), followed by Embase (n = 5) and MEDLINE/PubMed (n = 5).
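Counting unique references per database, as reported above, reduces to checking which references no other investigated database retrieved. A minimal sketch with hypothetical coverage sets (not the study's data):

```python
# A reference counts as "unique" to a database when none of the other
# investigated databases retrieved it. Coverage sets are hypothetical.

coverage = {
    "DatabaseA": {"ref1", "ref2", "ref3"},
    "DatabaseB": {"ref2", "ref3", "ref4"},
    "DatabaseC": {"ref3", "ref5"},
}

def unique_refs(name: str, coverage: dict) -> set:
    """References retrieved by `name` and by no other database."""
    others = set().union(*(refs for db, refs in coverage.items() if db != name))
    return coverage[name] - others

for db in coverage:
    print(db, sorted(unique_refs(db, coverage)))
```

Excluding a database (as done here for Google Scholar) simply means removing its entry from `coverage` before the calculation, which is why the unique counts rise when Google Scholar is left out.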

Table 2 Number of unique references in each database

Search of databases and their overall recall

For each database and each combination of databases, recall rates across the 501 individual references were calculated; the results are shown in Table 3. Google Scholar showed the highest overall recall rate (97%), and Scopus the second highest (92%). The seven databases had overall recall rates between 39 and 97%. Among combinations of two databases, Google Scholar and Embase yielded the highest overall recall rate (99%). Excluding Google Scholar, the two-database combination with the highest overall recall rate was MEDLINE/PubMed and CINAHL (96%). The three-database combination with the highest overall recall rate was Google Scholar, Embase, and either MEDLINE/PubMed (99.6%) or CINAHL (99.6%). Excluding Google Scholar, the three-database combination with the highest overall recall rate was MEDLINE/PubMed, Embase, and CINAHL (98.8%).
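An exhaustive comparison of database combinations like the one above can be sketched with `itertools.combinations`; the coverage sets here are hypothetical placeholders, not the study's data:

```python
from itertools import combinations

# Score every pair and triple of databases by combined recall, i.e. the
# share of included references retrieved by the union of their coverage.
# Coverage sets below are hypothetical, for illustration only.

included = set(range(20))  # 20 included references
coverage = {
    "DatabaseA": set(range(0, 14)),
    "DatabaseB": set(range(6, 18)),
    "DatabaseC": set(range(10, 20)),
    "DatabaseD": set(range(0, 8)) | {19},
}

def combined_recall(dbs, coverage, included):
    """Percentage of `included` retrieved by the union of the given databases."""
    found = set().union(*(coverage[db] for db in dbs))
    return 100 * len(found & included) / len(included)

for k in (2, 3):
    best = max(combinations(coverage, k),
               key=lambda dbs: combined_recall(dbs, coverage, included))
    print(k, best, combined_recall(best, coverage, included))
```

With seven databases there are only 21 pairs and 35 triples, so brute-force enumeration of every combination is entirely feasible.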

Table 3 Individual and combined recall rates of references in the included databases

Discussion

Our study underlines the importance of choosing the optimal combination of databases when conducting a qualitative SR regarding diabetes mellitus. It has previously been suggested that a SR must include at least 95% of the publications on any given subject to be acceptable [9]. With Google Scholar excluded from the analyses, we found that the combinations of MEDLINE/PubMed and CINAHL (96.4%) and of MEDLINE/PubMed, CINAHL, and Embase (98.8%) yielded the highest overall recall rates when combining two and three databases, respectively. However, other combinations of databases yielded comparable recall rates and are expected to perform similarly. Furthermore, CINAHL retrieved the highest number of unique references (n = 15), followed by MEDLINE/PubMed (n = 5) and Embase (n = 5), when Google Scholar was excluded. Based on these findings, we recommend searching at least the combination of MEDLINE/PubMed and CINAHL (a combination applied by 20 of the 26 SRs) when conducting qualitative SRs regarding diabetes mellitus.

These results contrast with a previous study concluding that the combination of Scopus, CINAHL and ProQuest Dissertations and Theses Global (hereafter referred to as ProQuest) contributed the highest number of unique references for qualitative SRs [7]. ProQuest was not investigated in this study, as unpublished references were excluded. However, this exclusion comprised only two references, and ProQuest would therefore not be expected to retrieve a high number of unique references here. In alignment with previous findings [7, 26], our data showed that CINAHL retrieved the highest number of unique references (excluding Google Scholar), suggesting that CINAHL is highly relevant when searching the literature for qualitative diabetes mellitus research. CINAHL focuses on nursing and allied health research, a scope that may be too narrow when researching multi- or interdisciplinary health science literature. In such cases, multidisciplinary databases such as Scopus and Web of Science could prove higher yielding [27]. Therefore, the nature of the research question should be carefully considered when deciding on the optimal combination of databases.

Google Scholar had the highest individual overall recall rate in this study (97%) and adds further value by identifying grey literature [3]. However, databases such as ProQuest and GreySource offer similar access to grey literature. Despite the advantages of Google Scholar, its precision in structured literature searches has previously been reported to be low [22, 23]. Google Scholar has several significant limitations, including search expressions being limited to 256 characters, a display of at most 1,000 of the complete results with no explanation of how the results are ranked, and no bulk export options. Therefore, this search engine has previously been assessed as inadequate as a standalone resource for data gathering when conducting comprehensive search activities such as SRs [3]. We recommend that Google Scholar be used as a supplement to traditional scientific database searches in order to enhance the retrieval of unique or unpublished references.

The majority of SRs searched the reference lists of included articles, similar to previous findings [28]. It could be argued that a comprehensive search of the optimal combination of databases would render the search of reference lists redundant, which our data on overall recall rates support. However, whether a reference is present in a database does not directly translate into it being found with a given search string. Searching reference lists therefore remains a valid way of finding additional references not retrieved by database searches alone. Two of the 26 SRs used either an information specialist or a search filter developed by an information specialist. These results contrast with a prior quantitative study reporting that 51% of SRs used a librarian, though only 64% of those SRs actually reported this use [5]. Although our findings alone are insufficient to support firm recommendations, we recommend consulting an information specialist before conducting database searches, given the challenge of each database requiring different search strings.

MEDLINE and PubMed were treated as one database, due to the major overlap of references, to avoid misleading results. However, it might be relevant to treat them as independent databases when conducting academic literature searches. PubMed includes all MEDLINE references as well as up-to-date citations, books and book chapters, and references from journals not indexed in MEDLINE, such as PMC journals [14, 29]. The larger quantity of content in PubMed compared to MEDLINE (91% of PubMed content is indexed in MEDLINE [21]) might contribute more relevant references when conducting a SR. On the other hand, PMC literature has been criticized for potentially reducing the quality of PubMed due to its informal reevaluation process (prior to 2017), though most manuscripts in PMC are also published in MEDLINE-indexed journals [21]. For these reasons, we recommend the use of PubMed over MEDLINE.

Limitations

This study has several limitations. Firstly, the SRs included in this study were found through the database PubMed; other databases were not searched for SRs of qualitative research regarding diabetes mellitus. Secondly, the search string solely used MeSH terms and, because of this, may not have recovered all qualitative SRs of diabetes mellitus in the PubMed database, contributing to selection bias. However, as this is an exploratory study and not a SR or meta-analysis, a sample of collectable data was assessed to be sufficient. Thirdly, since we only investigated SRs of qualitative research regarding diabetes mellitus, our results may not apply to other diseases or topics of research. Fourthly, not all databases were investigated in this study, such as SSCI and the British Nursing Index, which were the sixth and eighth most frequently searched databases. It is possible that combinations including these databases might have resulted in different conclusions. Fifthly, whether a reference is present in a database does not directly translate into whether it would have been found using a given search string. Therefore, our results may not be directly transferable to the search for references when conducting a SR. Sixthly, the recall rates in this study were derived from the references included in the SRs and not from the actual number of references available and relevant to the same SRs at their time of inclusion. There were likely relevant references on qualitative research on diabetes mellitus not included in the SRs; if included in our study, these references might have altered the results and recommendations for database selection.

Conclusions

We found that the combinations of MEDLINE/PubMed and CINAHL (96.4%) and of MEDLINE/PubMed, CINAHL, and Embase (98.8%) yielded the highest overall recall rates (when Google Scholar was excluded from the analyses) of references included in SRs of qualitative research regarding diabetes mellitus. Other combinations of databases did, however, yield comparable recall rates and are expected to perform similarly. Google Scholar can be a useful supplement to traditional scientific databases to ensure optimal and comprehensive retrieval of relevant references, both academic and grey literature. Further research should seek to establish whether our findings within the field of diabetes mellitus extend to other disease areas within qualitative research.