Chapter 4: Effective Search Strategies for Systematic Reviews of Medical Tests

  • Rose Relevo
Open Access
Original Research


This article discusses techniques that are appropriate when developing search strategies for systematic reviews of medical tests. This includes general advice for searching for systematic reviews and issues specific to systematic reviews of medical tests. Diagnostic search filters are currently not sufficiently developed for use when searching for systematic reviews. Instead, authors should construct a highly sensitive search strategy that uses both controlled vocabulary and text words. A comprehensive search should include multiple databases and sources of grey literature. A list of subject-specific databases is included in this article.


Keywords: systematic reviews; bibliographic databases; information retrieval

Locating all published studies relevant to the key questions is a goal of all systematic reviews. Inevitably, systematic reviewers encounter variation in whether or how a study is published and in how the elements of a study are reported in the literature or indexed by organizations such as the National Library of Medicine. A systematic search must attempt to overcome these issues in order to identify all relevant studies, taking into account the usual constraints on time and resources.

Although I have written this article to serve as guidance for Evidence-based Practice Centers (EPCs), I also intend for this to be a useful resource for other investigators interested in conducting systematic reviews on medical tests; in particular this provides guidance for the librarian or information specialist conducting the search. Searching for genetic tests and prognostic studies is covered in papers 11 and 12 of this series.

While this paper will discuss issues specific to systematic reviews of medical tests (screening, diagnostic and prognostic), it is important to remember that general guidance on searching for systematic reviews1 also applies. Literature searches will always be a balance between recall (how much of the relevant literature is located) and precision (how much of the retrieved literature is relevant). The optimal balance depends on context. Within the context of comparative effectiveness research, the goal is to have a comprehensive (if not exhaustive) search while still trying to minimize the resources necessary for review of the retrieved citations.
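The recall and precision trade-off described above can be made concrete with a small calculation. The following is a minimal sketch in Python; the function name and the record identifiers are hypothetical and chosen only for illustration. In practice the full set of relevant studies is never known, so recall is usually estimated against a reference set such as the studies included in an earlier review.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for one search.

    retrieved: IDs of the records the search returned
    relevant:  IDs of the records known to be relevant (a reference set)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant records the search actually found
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical numbers: the search returns 200 records; 10 relevant
# studies exist, and 9 of them are among the 200 retrieved.
retrieved_ids = set(range(1, 201))
relevant_ids = set(range(1, 10)) | {999}
recall, precision = recall_and_precision(retrieved_ids, relevant_ids)
print(f"recall={recall:.2f}, precision={precision:.3f}")
```

In this made-up case the search finds 9 of 10 relevant studies but reviewers must screen 200 records to get them, which is the kind of balance a comprehensive search accepts.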

In general, bibliographic searches should always include MEDLINE and the Cochrane Central Register of Controlled Trials. Additional databases that are often useful to search include EMBASE, CINAHL and PsycINFO. When constructing the searches in these bibliographic databases, it is important to use both controlled and uncontrolled vocabulary and to tailor the search for each individual database. Limits such as age and language should not be used unless a specific case can be made for their use.

Working closely with the research team, and consulting the analytic framework and the inclusion and exclusion criteria, will help to develop the search strategy. Reading the references of all included studies is a useful technique to identify additional studies, as is using a citation database such as Scopus or Web of Science to find articles that have cited key articles. In addition to published literature, a comprehensive search will include looking for unpublished or “grey literature.” In the context of comparative effectiveness research, regulatory information, clinical trial registries and conference proceedings/abstracts are the most useful sources for identifying data.


Systematic reviews of test strategies for a given condition require a search on each of the relevant test strategies under consideration. In conducting the search, systematic reviewers may use one of two approaches. The reviewers may search on all possible tests used to evaluate the given disease, which requires knowing all the possible test strategies available, or they may search on the disease or condition and then focus on medical test evaluation for that disease.

When a review focuses on specific named tests, searching is relatively straightforward. The names of the tests can be used to locate studies, and a specific search for the concept of diagnosis, screening or prognosis may not be necessary2,3. Because testing strategies are constantly evolving, using the strategy of relying on specific named tests may risk missing emerging approaches. Tests that measure a gene product may be associated with multiple diseases, so searching by test name alone may be insufficient. Searching for the target illness in addition to known test names, or alone if specific tests are unknown, is often advisable. However, searches for a disease or condition are broader searches and greatly increase the burden of work in filtering down to the relevant studies on medical test evaluation.


Principle 1: Do Not Rely on Search Filters Alone

Several search filters (sometimes called “hedges”), which are pre-prepared and tested searches that can be combined with searches on a particular disease or condition, have been developed to aid systematic reviewers evaluating medical tests. Most of these filters have been developed for MEDLINE®2, 3, 4, 5, 6. In particular, one filter7 is used in the PubMed® Clinical Queries for diagnosis (Table 1). Search filters have also been developed specifically for diagnostic imaging8 and for EMBASE®9,10.
Table 1. Diagnosis Clinical Query for PubMed

PubMed search string:
(sensitiv*[Title/Abstract] OR sensitivity and specificity[MeSH Terms] OR diagnos*[Title/Abstract] OR diagnosis[MeSH:noexp] OR diagnostic* [MeSH:noexp] OR diagnosis,differential[MeSH:noexp] OR diagnosis[Subheading:noexp])

Unfortunately, although these search filters are useful for the casual searcher who simply needs some good articles on diagnosis, they are inappropriate for use in systematic reviews of clinical effectiveness. Several researchers6,11, 12, 13, 14 have reported that using these filters for systematic reviews may result in relevant studies being missed. Vincent found that most of the available filters perform better when they are being evaluated than when they are used in the context of an actual systematic review13; this finding is particularly true for studies published before 1990 because of non-standardized reporting and indexing of medical test studies.

In recent years, improved reporting and indexing of randomized controlled trials (RCTs) have made such trials much easier to find. There is reason to believe that reporting and indexing of medical test studies will similarly improve in the future12. In fact, Kastner and colleagues15 recently reviewed 22 systematic reviews of diagnostic accuracy published in 2006 to determine whether the PubMed Clinical Queries Filter for diagnosis would be sufficient to locate all the primary studies that the 22 systematic reviews had identified through traditional search strategies. Using these filters in MEDLINE and EMBASE, the authors found 99 percent of the articles in the systematic reviews they examined, and they determined that the missed articles would not have altered the conclusions of the systematic reviews. The authors therefore concluded that filters may be appropriate when searching for systematic reviews of medical test accuracy. However, until more evidence of their effectiveness is found, we recommend that searchers not rely on them exclusively.

Principle 2: Do Not Rely on Controlled Vocabulary (Subject Headings) Alone

When searching, it is important to use all known variants of the test name, such as abbreviations, generic and proprietary names, and international terms and spellings; not all of these will be controlled vocabulary terms. Because reporting and indexing of studies of medical tests are so variable, one cannot rely on controlled vocabulary terms alone3.

Using textwords for particular medical tests will help to identify medical test articles that have not yet been indexed or that have not been indexed properly2. Filters may suggest the sort of textwords that may be appropriate. Michel16 discusses appropriate MeSH headings and other terminology useful for searching for medical tests.
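As an illustration of combining controlled vocabulary with textword variants, the sketch below assembles a PubMed-style search block for a named test. The helper names are hypothetical, and the BNP terms are illustrative only; they are not the strategy used in any published review. The field tags `[mh]` (MeSH Terms) and `[tiab]` (Title/Abstract) are standard PubMed tags.

```python
def textword(term, field="tiab"):
    """Tag a free-text term with a PubMed field; quote multi-word phrases."""
    quoted = f'"{term}"' if " " in term else term
    return f"{quoted}[{field}]"


def named_test_query(mesh_headings, textwords):
    """OR together subject headings and textword variants for one test."""
    mesh = [f'"{heading}"[mh]' for heading in mesh_headings]
    free_text = [textword(term) for term in textwords]
    return "(" + " OR ".join(mesh + free_text) + ")"


# Illustrative variants for one test (assumed list, not exhaustive).
query = named_test_query(
    mesh_headings=["Natriuretic Peptide, Brain"],
    textwords=["BNP", "NT-proBNP", "brain natriuretic peptide"],
)
print(query)
```

The textword terms catch records that are not yet indexed or are indexed inconsistently, while the subject heading catches records that use wording the variant list did not anticipate. A search for another database would need that database's own thesaurus terms and field syntax.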

Principle 3: Search in Multiple Locations

As always—but in particular with searches for studies of medical tests—we advise systematic reviewers to search more than one database and to tailor search strategies to each individual database17. Because there can be little overlap between many databases18, 19, 20, failure to search additional databases carries a risk of bias21, 22, 23. For more information on potentially appropriate databases to use, see Table 2.
Table 2. Specialized databases

Free databases | Topic coverage
C2-SPECTR (Campbell Collaboration’s Social, Psychological, Educational and Criminology Trials Register) | Trial register for social sciences (similar to DARE)
ERIC (Education Resources Information Center) | Education, including the education of health care professionals as well as educational interventions for patients
IBIDS (International Bibliographic Information on Dietary Supplements) | Dietary supplements
ICL (Index to Chiropractic Literature) | (topic missing)
NAPS (New Abstracts and Papers in Sleep) | (topic missing)
OTseeker (Occupational Therapy Systematic Evaluation of Evidence) | Occupational therapy
PEDro (Physiotherapy Evidence Database) | Physical therapy
(database name missing) | PTSD and traumatic stress
(database name missing) | Population, family planning and reproductive health
(database name missing) | Biology and health sciences
RDRB (Research and Development Resource Base) | Medical education
Social Care Online | Social care including: healthcare, social work and mental health
(database name missing) | Toxicology, environmental health, adverse effects
TRIS (Transportation Research Information Service) | Transportation research
WHO Global Health Library | International biomedical topics. Global Index Medicus

Subscription databases | Topic coverage
(database name missing) | Aging, health topics of interest to people over 50
AMED (Allied and Complementary Medicine Database) | Complementary medicine and allied health
ASSIA (Applied Social Science Index and Abstracts) | Applied social sciences including: anxiety disorders, geriatrics, health, nursing, social work and substance abuse
BNI (British Nursing Index) | Nursing and midwifery
(database name missing) | Child-related topics including child health
CINAHL (Cumulative Index to Nursing and Allied Health) | Nursing and allied health
(database name missing) | Community issues including community health
(database name missing) | Biomedical, with an emphasis on drugs and pharmaceuticals; more non-US coverage than MEDLINE
(database name missing) | Nursing and allied health
Global Health | International health
HaPI (Health and Psychosocial Instruments) | Health and psychosocial testing instruments
IPA (International Pharmaceutical Abstracts) | Drugs and pharmaceuticals
MANTIS (Manual Alternative and Natural Therapy Index System) | Osteopathy, chiropractic and alternative medicine
(database name missing) | Psychological literature
Sociological Abstracts | Sociology including: health and medicine and the law, social psychology and substance abuse and addiction
Social Services Abstracts | Social services including: mental health services, gerontology and health policy
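Searching several databases, as Principle 3 advises, returns overlapping result sets that have to be merged before screening. The sketch below shows one possible way to de-duplicate exported records while keeping track of which database supplied each one. The record fields (`title`, `year`, `doi`) and the matching rule are assumptions made for illustration; reference managers use more elaborate matching.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def record_key(record):
    """Prefer the DOI as identity; otherwise fall back to title plus year.

    Limitation: a record exported with a DOI from one database and without
    one from another will not be matched by this simple rule.
    """
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalize_title(record["title"]), record.get("year"))


def merge_results(results_by_database):
    """Merge per-database result lists into unique records with sources."""
    merged = {}
    for database, records in results_by_database.items():
        for record in records:
            entry = merged.setdefault(
                record_key(record), {"record": record, "sources": []}
            )
            if database not in entry["sources"]:
                entry["sources"].append(database)
    return list(merged.values())


# Hypothetical exports from two databases.
results = {
    "MEDLINE": [
        {"title": "Accuracy of test A for condition B", "year": 2005,
         "doi": "10.1000/example.1"},
    ],
    "EMBASE": [
        {"title": "Accuracy of Test A for Condition B.", "year": 2005,
         "doi": "10.1000/EXAMPLE.1"},
        {"title": "Test A in primary care", "year": 2007, "doi": None},
    ],
}
unique = merge_results(results)
print(len(unique), [entry["sources"] for entry in unique])
```

Recording the source databases for each unique record also gives a rough picture of how much each additional database contributed to the search.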

Until reporting and indexing are improved and standardized, a combination of highly sensitive searches and brute force article screening will remain the best approach for systematically searching the medical test literature6,11, 12, 13. However, this approach is still likely to miss relevant articles; therefore, authors should search additional sources of information. Citation tracking, which means reading the references of relevant articles as well as identifying articles that cite key studies, is an important source of additional citations24. Table 3 lists databases that are appropriate for tracking citations.
Table 3. Citation tracking databases

Database | Subscription status
Google Scholar | (status missing)
(database name missing) | Subscription required
Web of Science | Subscription required
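Citation tracking, as described before Table 3, works in two directions: backward through the reference lists of key articles and forward to the articles that cite them. The following minimal sketch assumes the citation data have already been exported from a citation database into two plain dictionaries; the function name, the data layout and the article identifiers are hypothetical.

```python
def citation_tracking(key_articles, references, cited_by):
    """Collect candidate articles one citation step away from the key set.

    references: article ID -> IDs of the articles it cites (backward)
    cited_by:   article ID -> IDs of the articles citing it (forward)
    """
    key_articles = set(key_articles)
    backward, forward = set(), set()
    for article in key_articles:
        backward.update(references.get(article, []))
        forward.update(cited_by.get(article, []))
    # Key articles are already in hand, so they are not new candidates.
    return sorted((backward | forward) - key_articles)


# Hypothetical citation data for two key articles, A and B.
references = {"A": ["R1", "R2"], "B": ["R2", "A"]}
cited_by = {"A": ["C1", "B"], "B": ["C2"]}
candidates = citation_tracking(["A", "B"], references, cited_by)
print(candidates)  # ['C1', 'C2', 'R1', 'R2']
```

The candidates still need to be screened against the inclusion criteria; the sketch only assembles the list to screen.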

In addition to bibliographic databases and citation analysis, regulatory documents are another potential source of information for systematic reviews of medical tests. The FDA regulates many medical tests as devices. The regulatory documents for diagnostic tests are available through the FDA’s Devices@FDA website.


As an example, in the AHRQ report, Testing for BNP and NT-proBNP in the Diagnosis and Prognosis of Heart Failure,25 the medical tests in question were known. Therefore, the search consisted of all possible variations on the names of these tests and did not need to include a search string to capture the diagnostic testing concept. By contrast, in the AHRQ report, Effectiveness of Noninvasive Diagnostic Tests for Breast Abnormalities,26 not all possible diagnostic tests were known. For this reason, the search strategy included a search string meant to capture the diagnostic testing concept, and this relied heavily on textwords. The actual search strategy used in PubMed to capture the concept of diagnostic tests was as follows:

diagnosis OR diagnose OR diagnostic OR di[sh] OR “gold standard” OR “ROC” OR “receiver operating characteristic” OR sensitivity and specificity[mh] OR likelihood OR “false positive” OR “false negative” OR “true positive” OR “true negative” OR “predictive value” OR accuracy OR precision.
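When the tests are not all known in advance, a concept string like the one quoted above is combined with a search on the condition. The sketch below shows that combination in Python. The concept terms are taken from the text; the condition terms and the function name are hypothetical and are not the condition search used in the report.

```python
# Diagnostic-testing concept terms as quoted in the text (PubMed syntax).
DIAGNOSTIC_CONCEPT = " OR ".join([
    "diagnosis", "diagnose", "diagnostic", "di[sh]",
    '"gold standard"', '"ROC"', '"receiver operating characteristic"',
    "sensitivity and specificity[mh]", "likelihood",
    '"false positive"', '"false negative"',
    '"true positive"', '"true negative"',
    '"predictive value"', "accuracy", "precision",
])


def condition_and_concept(condition_terms, concept=DIAGNOSTIC_CONCEPT):
    """AND a block of condition terms with the diagnostic-testing concept."""
    if not condition_terms:
        raise ValueError("at least one condition term is required")
    condition = " OR ".join(condition_terms)
    return f"({condition}) AND ({concept})"


# Hypothetical condition terms, for illustration only.
query = condition_and_concept(
    ['"breast neoplasms"[mh]', '"breast lesion"[tiab]']
)
print(query)
```

Keeping the concept block as a separate, reusable string makes it easier to document the strategy and to adapt it when the search is translated for another database.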


Key points are:
  • Diagnostic search filters—or, more specifically, the reporting and indexing of medical test studies upon which these filters rely—are not sufficiently well developed to be depended upon exclusively for systematic reviews.

  • If the full range of tests is known, one may not need to search for the concept of diagnostic testing; searching for the specific test using all possible variant names may be sufficient.

  • Combining highly sensitive searches utilizing textwords with hand searching and acquisition and review of cited references in relevant papers is currently the best way to identify all or most relevant studies for a systematic review.

  • Do not rely on controlled vocabulary alone.

  • Check Devices@FDA.


Conflict of Interest

The author declares that he/she does not have a conflict of interest.


  1. Relevo R, Balshem H. Finding evidence for comparing medical interventions: Agency for Healthcare Research and Quality (AHRQ) and the Effective Health Care program. J Clin Epidemiol. 2011. [epub before print]
  2. Deville WL, Bezemer PD, Bouter LM. Publications on diagnostic test evaluation in family medicine journals: an optimal search strategy. J Clin Epidemiol. 2000;53(1):65–9.
  3. van der Weijden T, et al. Identifying relevant diagnostic studies in MEDLINE. The diagnostic value of the erythrocyte sedimentation rate (ESR) and dipstick as an example. Fam Pract. 1997;14(3):204–8.
  4. Bachmann LM, et al. Identifying diagnostic studies in MEDLINE: reducing the number needed to read. J Am Med Inform Assoc. 2002;9(6):653–8.
  5. Haynes RB, et al. Developing optimal search strategies for detecting clinically sound studies in MEDLINE. J Am Med Inform Assoc. 1994;1(6):447–58.
  6. Ritchie G, Glanville J, Lefebvre C. Do published search filters to identify diagnostic test accuracy studies perform adequately? Health Info Libr J. 2007;24(3):188–92.
  7. Haynes RB, Wilczynski NL. Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey. BMJ. 2004;328(7447):1040.
  8. Astin MP, et al. Developing a sensitive search strategy in MEDLINE to retrieve studies on assessment of the diagnostic performance of imaging techniques. Radiology. 2008;247(2):365–73.
  9. Bachmann LM, et al. Identifying diagnostic accuracy studies in EMBASE. J Med Libr Assoc. 2003;91(3):341–6.
  10. Wilczynski NL, Haynes RB. EMBASE search strategies for identifying methodologically sound diagnostic studies for use by clinicians and researchers. BMC Med. 2005;3:7.
  11. Leeflang MM, et al. Use of methodological search filters to identify diagnostic accuracy studies can lead to the omission of relevant studies. J Clin Epidemiol. 2006;59(3):234–40.
  12. Doust JA, et al. Identifying studies for systematic reviews of diagnostic tests was difficult due to the poor sensitivity and precision of methodologic filters and the lack of information in the abstract. J Clin Epidemiol. 2005;58(5):444–449.
  13. Vincent S, Greenley S, Beaven O. Clinical Evidence diagnosis: Developing a sensitive search strategy to retrieve diagnostic studies on deep vein thrombosis: a pragmatic approach. Health Info Libr J. 2003;20(3):150–9.
  14. Whiting P, et al. Inclusion of methodological filters in searches for diagnostic test accuracy studies misses relevant studies. J Clin Epidemiol. 2011;64(6):602–607.
  15. Kastner M, et al. Diagnostic test systematic reviews: Bibliographic search filters ("clinical queries") for diagnostic accuracy studies perform well. J Clin Epidemiol. 2009.
  16. Michel P, Mouillet E, Salmi LR. Comparison of Medical Subject Headings and standard terminology regarding performance of diagnostic tests. J Med Libr Assoc. 2006;94(2):221–3.
  17. Honest H, Bachmann LM, Khan K. Electronic searching of the literature for systematic reviews of screening and diagnostic tests for preterm birth. Eur J Obstet Gynecol Reprod Biol. 2003;107(1):19–23.
  18. Conn VS, et al. Beyond MEDLINE for literature searches. J Nurs Scholarsh. 2003;35(2):177–82.
  19. Suarez-Almazor ME, et al. Identifying clinical trials in the medical literature with electronic databases: MEDLINE alone is not enough. Control Clin Trials. 2000;21(5):476–487.
  20. Betran AP, et al. Effectiveness of different databases in identifying studies for systematic reviews: experience from the WHO systematic review of maternal morbidity and mortality. BMC Med Res Methodol. 2005;5(1):6.
  21. Sampson M, et al. Should meta-analysts search Embase in addition to Medline? [see comment]. J Clin Epidemiol. 2003;56(10):943–55.
  22. Zheng MH, et al. Searching additional databases except PubMed are necessary for a systematic review. Stroke. 2008;39(8):e139. author reply e140.
  23. Stevinson C, Lawlor DA. Searching multiple databases for systematic reviews: added value or diminishing returns? Compl Ther Med. 2004;12(4):228–32.
  24. Whiting P, et al. Systematic reviews of test accuracy should search a range of databases to identify primary studies. J Clin Epidemiol. 2008;61(4):357–364.
  25. Balion C, et al. Testing for BNP and NT-proBNP in the Diagnosis and Prognosis of Heart Failure. Evidence Report/Technology Assessment No. 142. (Prepared by the McMaster University Evidence-based Practice Center under Contract No. 290-02-0020). AHRQ Publication No. 06-E014. Rockville, MD: Agency for Healthcare Research and Quality. September 2006. Accessed August 7, 2011.
  26. Bruening W, et al. Effectiveness of Noninvasive Diagnostic Tests for Breast Abnormalities. Comparative Effectiveness Review No. 2. (Prepared by ECRI Evidence-based Practice Center under Contract No. 290-02-0019.) Rockville, MD: Agency for Healthcare Research and Quality. February 2006. Accessed August 7, 2011.

Copyright information

© Agency for Healthcare Research and Quality (AHRQ) 2012

Authors and Affiliations

  1. Oregon Health & Science University, Portland, USA
