Defining and measuring imaging appropriateness in low back pain studies: a scoping review



Patients with low back pain (LBP) rarely have serious underlying pathology but frequently undergo inappropriate imaging. A range of guidelines and red flag features are utilised to characterise appropriate imaging. This scoping review explores how LBP imaging appropriateness is determined and calculated in studies of primary care practice.


This scoping review builds upon a previous meta-analysis, incorporating the articles it identified that were published since 2014 and adding an updated search to capture articles published after the original search. Electronic databases were searched, and citation lists of included papers were reviewed. Inclusion criteria were studies assessing adult LBP imaging appropriateness in a primary care setting. Twenty-three eligible studies were identified.


A range of red flag features were utilised to determine imaging appropriateness. Most studies considered appropriateness in a binary manner, by the presence of any red flag feature. Ten guidelines were referenced, with 7/23 (30%) included studies amending or not referencing any guideline. The method for calculating the proportion of inappropriate imaging varied. Ten per cent of the studies used the total number of patients presenting with LBP as the denominator, suggesting that most studies overestimated the rate of inappropriate imaging and did not capture cases where imaging was not performed for clinically suspicious LBP.


Greater clarity is needed on how we define and measure imaging appropriateness for LBP, in a way that also accounts for the problem of failing to image when indicated. An internationally agreed methodology for imaging appropriateness studies would ultimately lead to an improvement in the care delivered to patients.

Graphic abstract

These slides can be retrieved under Electronic Supplementary Material.


Low back pain (LBP) is the leading cause of disability worldwide and a common presentation to medical services, with an age-standardised point prevalence of 9.4% [1]. LBP is usually benign and self-limiting, but in 1.4–5% of presentations it is the presenting feature of serious spinal pathology such as malignancy [2, 3]. Choosing Wisely, an initiative established by the American Board of Internal Medicine to avoid unnecessary medical interventions, recommends that spinal imaging should be avoided in patients with no clear indicators of serious pathology and a symptom duration of less than 6 weeks [4]. In addition to the economic cost, inappropriate imaging may lead to patients ascribing their pain to incidental imaging findings, increasing the likelihood of seeking unnecessary interventions [5, 6]. Conversely, failing to image when indicated may delay timely management of an underlying serious condition.

Multiple studies have investigated rates of inappropriate LBP imaging; estimates vary markedly, ranging from 3.8% to 88.5% [7]. Appropriateness is often judged by red flags: clinical features thought to raise suspicion of serious pathology. However, the nature and number of red flags vary widely between guidelines [8]. The choice of red flags to include when assessing LBP imaging appropriateness is important, given the wide variation in their predictive value [9, 10]. In addition, studies vary in how they calculate the numerator and denominator to determine imaging appropriateness. Both these issues likely contribute to uncertainty about how much lumbar imaging is inappropriate.

Here we review the criteria that studies use to assess the appropriateness of LBP imaging in primary care, their compliance with clinical guidelines, and how they calculate proportions of inappropriate imaging.


This scoping review built upon the work of a systematic review and meta-analysis by Jenkins et al. that identified studies assessing appropriateness of imaging for LBP [7]. This review included studies identified by Jenkins et al. published in the last 5 years (since 2014), plus studies from a repeat of the original search to identify subsequently published papers.

MEDLINE, EMBASE, and CINAHL were searched from 1st of January 2018 to 20th of February 2019, using the same search terms as Jenkins et al. [7]. Citation lists of included papers were also reviewed. Inclusion criteria were as follows: studies assessing LBP imaging appropriateness; studies in a primary care setting; and studies of adult patients. The following article types were excluded: case reports; case series; reviews; conference abstracts.

One author performed the initial title and abstract screen, identifying studies appropriate for a full-text review. These were combined with studies identified by Jenkins et al. in the last 5 years to give a complete list of eligible studies. Data extraction was performed by two authors using a data collection proforma. Detail was collected on the clinical setting in which a study was conducted, the criteria used to assess imaging appropriateness and their consistency with relevant national guidelines, how studies obtained the information (i.e. chart review or insurance claims data), and the method by which studies calculated the proportion of appropriate imaging.


The electronic search identified 708 papers. A total of 674 were excluded after title and abstract review, leaving 34 for full manuscript review. A further 27 were excluded after full review. In total, seven studies eligible for inclusion were identified from the electronic search. These were combined with the 15 studies identified from the Jenkins et al. review published since 2014, and one further study identified from citation lists review, to give a total of 23 eligible studies. See Fig. 1 for more detail.

Fig. 1

Flow diagram of included studies

Table 1 describes the included studies, with detail on the criteria used to assess imaging appropriateness, guidelines followed or adapted by the study, and the source from which studies collected data.

Table 1 Included studies with details on indications to image

Fourteen of the 23 (61%) eligible studies considered prolonged symptom duration as an indication for imaging. Two of the nine studies that did not consider symptom duration included a trial of conservative therapy as an indication. Most studies (19/23, 83%) assessed imaging as inappropriate or appropriate in a binary manner, whereas four used a grading system to give an appropriateness score.

A broad range of red flags were utilised and can be stratified into three groups: (1) clinical features (23 in total); (2) suspicion of pathology (5 in total); and (3) past medical history (14 in total). The total number of red flags considered by each study ranged from 1 to 18. The most frequent clinical feature red flag was neurological impairment, present in 16/23 (70%) of studies. Age was used by seven studies but with inconsistent cut-offs. One study stratified red flags by age, with the combination of red flags required for imaging dependent on age (e.g. age over 60 years + history of trauma + female gender, corticosteroid use, or increased thoracic kyphosis). This was also the only study that utilised clusters of red flags, rather than relying on individual features [11]. The number of studies that endorsed each of the clinical feature red flags is detailed in Fig. 2.

Fig. 2

Bar chart displaying relative frequencies of clinical features used as red flags for LBP imaging
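The difference between judging appropriateness by any single red flag and judging it by a cluster of red flags can be sketched in code. The example below is a minimal illustration only: the `Presentation` record, its field names, and the particular flags chosen for the binary rule are hypothetical, and the cluster rule is one possible reading of the age-stratified example given above (age over 60 years, plus a history of trauma, plus at least one of female gender, corticosteroid use, or increased thoracic kyphosis). It does not reproduce any study's or guideline's criteria and is not a clinical tool.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Hypothetical record of one LBP presentation (field names are illustrative)."""
    age: int
    female: bool = False
    trauma: bool = False
    corticosteroid_use: bool = False
    thoracic_kyphosis: bool = False
    neurological_impairment: bool = False
    history_of_malignancy: bool = False


def indicated_binary(p: Presentation) -> bool:
    """Binary rule: imaging counts as appropriate if ANY listed red flag is present.

    The flags listed here are an arbitrary illustrative subset, not a guideline.
    """
    return any([
        p.trauma,
        p.corticosteroid_use,
        p.neurological_impairment,
        p.history_of_malignancy,
    ])


def indicated_cluster(p: Presentation) -> bool:
    """Cluster rule (assumed reading): age over 60 AND trauma AND one extra feature."""
    extra_feature = p.female or p.corticosteroid_use or p.thoracic_kyphosis
    return p.age > 60 and p.trauma and extra_feature


# A 45-year-old with trauma only: flagged by the binary rule, not by the cluster rule.
younger = Presentation(age=45, trauma=True)
print(indicated_binary(younger), indicated_cluster(younger))

# A 72-year-old woman with trauma: flagged by both rules.
older = Presentation(age=72, female=True, trauma=True)
print(indicated_binary(older), indicated_cluster(older))
```

Under these assumptions the same presentation can be classed as appropriate for imaging by one rule and inappropriate by the other, which is the comparability problem that arises when studies use different red flag lists.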

Far fewer studies (n = 4) considered clinical suspicion of a serious pathology as a red flag. Past medical history was used widely, but with marked variation. History of malignancy was employed in 19/23 (83%) of studies. Four studies limited this to a history of malignancy within the last year, with one study also excluding primary skin and prostate cancers. The frequencies of the clinical suspicion and past medical history red flags are shown in Table 2.

Table 2 Relative frequencies of suspicion of pathology and past medical history red flags

A total of 10 guidelines were referenced, with 16/23 (70%) assessing imaging appropriateness in line with a guideline. Seven studies combined, amended, or did not reference any guideline [19, 20, 21, 22, 28, 30, 33].

The method of calculating the proportion of imaging that was inappropriate varied between studies. The two most common approaches, used in seven studies each, were:

$$\frac{\text{Number of inappropriate imaging requests for LBP}}{\text{Number of LBP imaging requests}}$$

$$\frac{\text{Number of inappropriate imaging requests for LBP}}{\text{Number of LBP patients not requiring imaging}}$$

Two studies calculated the proportion of appropriate imaging decisions, allowing an estimation of when imaging had been inappropriately not performed, as well as performed [29, 30]. Four of the studies calculated the proportion of all patients presenting with LBP who had imaging, in order to compare interventions. Two studies calculated the number of inappropriate LBP imaging requests, as a proportion of all patients presenting with LBP, and one assessed the total number of LBP imaging requests without calculating a proportion.
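The calculation approaches described above can be compared side by side if each presentation is placed in a two-by-two table of imaging indicated or not against imaging performed or not. The following is a minimal sketch under that assumption; the `Counts` class, its field names, and the example counts are hypothetical and are not drawn from any included study.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical 2x2 counts for one study (all names are illustrative)."""
    imaged_indicated: int          # imaged, appropriateness criteria met
    imaged_not_indicated: int      # imaged, criteria not met (inappropriate imaging)
    not_imaged_indicated: int      # not imaged although criteria were met
    not_imaged_not_indicated: int  # not imaged, criteria not met

    @property
    def total_patients(self) -> int:
        return (self.imaged_indicated + self.imaged_not_indicated
                + self.not_imaged_indicated + self.not_imaged_not_indicated)

    @property
    def total_imaged(self) -> int:
        return self.imaged_indicated + self.imaged_not_indicated

    def inappropriate_per_request(self) -> float:
        """Inappropriate requests / all LBP imaging requests."""
        return self.imaged_not_indicated / self.total_imaged

    def inappropriate_per_not_requiring(self) -> float:
        """Inappropriate requests / LBP patients not requiring imaging."""
        not_requiring = self.imaged_not_indicated + self.not_imaged_not_indicated
        return self.imaged_not_indicated / not_requiring

    def appropriate_decisions(self) -> float:
        """Appropriate decisions (to image or not to image) / all LBP patients."""
        appropriate = self.imaged_indicated + self.not_imaged_not_indicated
        return appropriate / self.total_patients

    def imaging_rate(self) -> float:
        """Patients imaged / all LBP patients."""
        return self.total_imaged / self.total_patients

    def inappropriate_per_patient(self) -> float:
        """Inappropriate requests / all LBP patients."""
        return self.imaged_not_indicated / self.total_patients

    def missed_imaging(self) -> float:
        """Patients not imaged despite an indication / all patients with an indication."""
        indicated = self.imaged_indicated + self.not_imaged_indicated
        return self.not_imaged_indicated / indicated


# Made-up counts for illustration only.
example = Counts(imaged_indicated=40, imaged_not_indicated=60,
                 not_imaged_indicated=25, not_imaged_not_indicated=375)
print(f"per imaging request:        {example.inappropriate_per_request():.1%}")
print(f"per patient not requiring:  {example.inappropriate_per_not_requiring():.1%}")
print(f"per presenting patient:     {example.inappropriate_per_patient():.1%}")
print(f"appropriate decisions:      {example.appropriate_decisions():.1%}")
print(f"indicated but not imaged:   {example.missed_imaging():.1%}")
```

Only the measures that use the `not_imaged_indicated` cell can reveal imaging that was inappropriately not performed; the two request-based proportions never reference it.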


This review highlights that widely varying criteria are employed to assess appropriateness of imaging for LBP. Most studies used red flag features to define imaging as appropriate, but the list of red flags varied substantially between studies. A Cochrane review assessed the predictive value of red flag features for spinal malignancy in patients presenting with LBP [34]. Frequently used red flag features such as age, neurological symptoms, and duration of symptoms had high false-positive rates, with only a previous history of malignancy having moderate predictive value. A further study assessed the performance of red flag features in predicting vertebral fracture, malignancy, infection, or cauda equina syndrome. Combinations of red flags performed well in predicting serious pathology; for example, a history of trauma in an individual older than 70 had a positive predictive value of 20.4. Night pain, pain that awakens a patient from sleep, did not predict any serious pathology [9]. Most studies identified in this review included at least one red flag with limited predictive value for serious pathology, and only one study measured against clusters of red flags.

Only a handful of the studies used clinician suspicion of serious pathology, rather than relying on the presence of individual red flag features. This approach requires clinical acumen and discretion but would provide clarity for clinicians.

Guidelines were inconsistently followed in the included studies, with nearly a third combining, amending, or not following guidelines, and studies from similar geographical regions opting for different guidelines. While inter-regional variation in guideline choice is to be expected, variation in guideline choice between studies within a country is less expected, as are authors’ decisions to amend, combine, or not use guidelines at all.

This variation in approach renders comparability problematic: it is likely that a clinical case deemed appropriate for imaging in one study may well have been considered inappropriate in another study. It also undermines the substantial effort and resources put into creating guidelines in the first place.

The use of varying methods to calculate the proportion of inappropriate imaging impacts on comparability. Seven of the studies assessed the proportion of inappropriate imaging by dividing the number of inappropriate requests by the total number of imaging requests. This method, when used alone, is flawed, as it will not capture instances where imaging has been inappropriately not performed and will overestimate the proportion of inappropriate imaging. The following example explains this further:

In a study, 1000 people presented with LBP. A total of 100 underwent LBP imaging, 10 of which were deemed inappropriate.

If the total number of LBP imaging requests is used as the denominator, this would be construed as 10/100 (10%) of patients presenting with LBP having inappropriate imaging.

If the number of patients presenting with LBP is used as the denominator, one can see that the actual proportion of LBP patients undergoing inappropriate imaging was 10/1000 (1%).
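The relationship between the two figures in this example can be written out as a short worked equation. This is an illustrative restatement of the arithmetic above rather than a formula taken from the included studies: the request-based proportion equals the patient-based proportion divided by the imaging rate (the share of presenting patients who were imaged).

$$\underbrace{\frac{10}{100}}_{\text{per imaging request}} = \underbrace{\frac{10}{1000}}_{\text{per presenting patient}} \div \underbrace{\frac{100}{1000}}_{\text{imaging rate}}$$

With an imaging rate of 10%, the request-based figure is therefore ten times the patient-based figure, and the gap widens as the imaging rate falls.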

This crucial limitation impacts on comparability between studies and prevents the identification of cases where patients with clinically suspicious LBP are not imaged. The two studies that included inappropriate non-imaging reported that nearly two-thirds of patients with clinically suspicious LBP were not imaged when they should have been [29, 30], suggesting this is a poorly recognised issue.

This scoping review identified studies from a broad electronic search of three databases, building upon the work of a previously published systematic review and meta-analysis [7], giving confidence that all eligible studies have been captured. The granularity of information extracted has enabled, for the first time, an in-depth comparison of how appropriateness of imaging for LBP is assessed, the degree to which studies follow clinical guidelines, and how studies calculate the proportion of inappropriate imaging.

The included studies varied in how clearly they described the assessment of imaging appropriateness. If a guideline was cited with no further details, the reference was reviewed and the appropriateness criteria were extracted. As much detail as possible has been included, with review by a second author to reduce the likelihood that any information was omitted.

This review focuses on LBP imaging in primary care. The findings should not be generalised to secondary or specialist services, as it is likely that the practice in these settings will be substantially different, often with a higher index of suspicion of serious pathology, and greater clinical expertise.


Reducing inappropriate lumbar imaging is a very common Choosing Wisely recommendation, but without agreement on how to define and measure appropriateness, we cannot know how big the problem is or whether progress is being made in solving it. Notably, the Choosing Wisely imaging recommendation does not consider the problem of failing to image when it is indicated.

Given its societal and economic impact, efficient assessment and management of LBP is crucial. To this end, care providers are increasingly embedding clinical decision support in online test ordering systems, but until the evidence base is clear as to which features should indicate imaging, the full benefit of these systems will not be realised. Further work and collaboration are urgently needed to identify and employ an internationally recognised methodology for defining and measuring imaging appropriateness for LBP.


1. Hoy D, March L, Brooks P, Blyth F, Woolf A, Bain C et al (2014) The global burden of low back pain: estimates from the Global Burden of Disease 2010 study. Ann Rheum Dis 73(6):968–974
2. Deyo RA, Rainville J, Kent DL (1992) What can the history and physical examination tell us about low back pain? JAMA 268(6):760–765
3. McGuirk B, King W, Govind J, Lowry J, Bogduk N (2001) Safety, efficacy, and cost effectiveness of evidence-based guidelines for the management of acute low back pain in primary care. Spine (Phila Pa 1976) 26(23):2615–2622
4. ABIM. Low back pain imaging 2019.
5. Flynn TW, Smith B, Chou R (2011) Appropriate use of diagnostic imaging in low back pain: a reminder that unnecessary imaging may do as much harm as good. J Orthop Sports Phys Ther 41(11):838–846
6. Lemmers GPG, van Lankveld W, Westert GP, van der Wees PJ, Staal JB (2019) Imaging versus no imaging for low back pain: a systematic review, measuring costs, healthcare utilization and absence from work. Eur Spine J 28(5):937–950
7. Jenkins HJ, Downie AS, Maher CG, Moloney NA, Magnussen JS, Hancock MJ (2018) Imaging for low back pain: is clinical use consistent with guidelines? A systematic review and meta-analysis. Spine J 18(12):2266–2277
8. Oliveira CB, Maher CG, Pinto RZ, Traeger AC, Lin CC, Chenot JF et al (2018) Clinical practice guidelines for the management of non-specific low back pain in primary care: an updated overview. Eur Spine J 27(11):2791–2803
9. Premkumar A, Godfrey W, Gottschalk MB, Boden SD (2018) Red flags for low back pain are not always really red: a prospective evaluation of the clinical utility of commonly used screening questions for low back pain. J Bone Jt Surg Am 100(5):368–374
10. Downie A, Williams CM, Henschke N, Hancock MJ, Ostelo RWJG, de Vet HCW et al (2013) Red flags to screen for malignancy and fracture in patients with low back pain: systematic review. BMJ 347:f7095
11. Suman A, Schaafsma FG, van de Ven PM, Slottje P, Buchbinder R, van Tulder MW et al (2018) Effectiveness of a multifaceted implementation strategy compared to usual care on low back pain guideline adherence among general practitioners. BMC Health Serv Res 18(1):358
12. Blackmore CC (2019) The relationship between Medicare outpatient efficiency measure OP8 and lumbar MRI utilization. J Am Coll Radiol 16(3):276–281
13. Zafar HM, Ip IK, Mills AM, Raja AS, Langlotz CP, Khorasani R (2018) Effect of clinical decision support-generated report cards versus real-time alerts on primary care provider guideline adherence for low back pain outpatient lumbar spine MRI orders. Am J Roentgenol 212(2):386–394
14. Colla CH, Morden NE, Sequist TD, Mainor AJ, Li Z, Rosenthal MB (2018) Payer type and low-value care: comparing Choosing Wisely services across commercial and Medicare populations. Health Serv Res 53(2):730–746
15. Wang KY, Yen CJ, Chen M, Variyam D, Acosta TU, Reed B et al (2018) Reducing inappropriate lumbar spine MRI for low back pain: radiology support, communication and alignment network. J Am Coll Radiol 15(1, Part A):116–122
16. Kullgren JT, Krupka E, Schachter A, Linden A, Miller J, Acharya Y et al (2018) Precommitting to choose wisely about low-value services: a stepped wedge cluster randomised trial. BMJ Qual Saf 27(5):355–364
17. Isaac T, Rosenthal MB, Colla CH, Morden NE, Mainor AJ, Li Z et al (2018) Measuring overuse with electronic health records data. Am J Manag Care 24(1):19–25
18. Rosenthal MB, Colla CH, Morden NE, Sequist TD, Mainor AJ, Li Z et al (2018) Overuse and insurance plan type in a privately insured population. Am J Manag Care 24(3):140–146
19. Allen H, Wright M, Craig T, Mardekian J, Cheung R, Sanchez R et al (2014) Tracking low back problems in a major self-insured workforce: toward improvement in the patient’s journey. J Occup Environ Med 56(6):604–620
20. Charlesworth CJ, Meath THA, Schwartz AL, McConnell KJ (2016) Comparison of low-value care in Medicaid vs commercially insured populations. JAMA Intern Med 176(7):998–1004
21. Gidwani R, Sinnott P, Avoundjian T, Lo J, Asch SM, Barnett PG (2016) Inappropriate ordering of lumbar spine magnetic resonance imaging: are providers Choosing Wisely? Am J Manag Care 22(2):e68–e76
22. Graves JM, Fulton-Kehoe D, Jarvik JG, Franklin GM (2014) Health care utilization and costs associated with adherence to clinical practice guidelines for early magnetic resonance imaging among workers with acute occupational low back pain. Health Serv Res 49(2):645–665
23. Hong AS, Ross-Degnan D, Zhang F, Wharam JF (2017) Clinician-level predictors for ordering low-value imaging. JAMA Intern Med 177(11):1577–1585
24. Ip IK, Gershanik EF, Schneider LI, Raja AS, Mar W, Seltzer S et al (2014) Impact of IT-enabled intervention on MRI use for back pain. Am J Med 127(6):512–518.e1
25. Kennedy SA, Fung W, Malik A, Farrokhyar F, Midia M (2014) Effect of governmental intervention on appropriateness of lumbar MRI referrals: a Canadian experience. J Am Coll Radiol 11(8):802–807
26. Kost A, Genao I, Lee JW, Smith SR (2015) Clinical decisions made in primary care clinics before and after Choosing Wisely™. J Am Board Fam Med 28(4):471–474
27. Lin IB, Coffin J, O’Sullivan PB (2016) Using theory to improve low back pain care in Australian Aboriginal primary care: a mixed method single cohort pilot study. BMC Fam Pract 17(1):44
28. Mohammadi N, Farahmand F, Hadizadeh Kharazi H, Mojdehipanah H, Karampour H, Nojomi M (2016) Appropriateness of physicians’ lumbosacral MRI requests in private and public centers in Tehran, Iran. Med J Islam Repub Iran 30:415
29. Rao S, Rao S, Harvey HB, Avery L, Saini S, Prabhakar AM (2015) Low back pain in the emergency department-are the ACR appropriateness criteria being followed? J Am Coll Radiol 12(4):364–369
30. Schlemmer E, Mitchiner JC, Brown M, Wasilevich E (2015) Imaging during low back pain. Am J Emerg Med 33(3):414–418
31. Tahvonen P, Oikarinen H, Niinimäki J, Liukkonen E, Mattila S, Tervonen O (2017) Justification and active guideline implementation for spine radiography referrals in primary care. Acta Radiol 58(5):586–592
32. Tan A, Zhou J, Kuo Y-F, Goodwin JS (2016) Variation among primary care physicians in the use of imaging for older patients with acute low back pain. J Gen Intern Med 31(2):156–163
33. Thackeray A, Hess R, Dorius J, Brodke D, Fritz J (2017) Relationship of opioid prescriptions to physical therapy referral and participation for Medicaid patients with new-onset low back pain. J Am Board Fam Med 30(6):784–794
34. Henschke N, Maher CG, Ostelo RW, de Vet HC, Macaskill P, Irwig L (2013) Red flags to screen for malignancy in patients with low-back pain. Cochrane Database Syst Rev 2:CD008686

Author information

Correspondence to Mark Yates.

Ethics declarations

Conflict of interest

James B. Galloway has received honoraria from Abbvie, Celgene, Janssen, Pfizer, and UCB. Mark Yates has received honoraria from UCB.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PPTX 144 kb)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Yates, M., Oliveira, C.B., Galloway, J.B. et al. Defining and measuring imaging appropriateness in low back pain studies: a scoping review. Eur Spine J (2020) doi:10.1007/s00586-019-06269-7


  • Low back pain
  • Imaging appropriateness
  • Red flags
  • Guideline compliance