European Spine Journal, Volume 25, Issue 9, pp 2728–2733

The Pfirrmann classification of lumbar intervertebral disc degeneration: an independent inter- and intra-observer agreement assessment

  • Julio Urrutia (corresponding author)
  • Pablo Besa
  • Mauricio Campos
  • Pablo Cikutovic
  • Mario Cabezon
  • Marcelo Molina
  • Juan Pablo Cruz
Original Article



Grading intervertebral disc degeneration (IDD) is important in the evaluation of many degenerative conditions, including in patients with low back pain. Magnetic resonance imaging (MRI) is considered the best imaging modality for evaluating IDD. The Pfirrmann classification is commonly used to grade IDD. Its original authors reported adequate agreement, but few independent agreement studies of this grading system have been published. The aim of this study was to perform an independent inter- and intra-observer agreement assessment of the Pfirrmann classification.


T2-weighted sagittal images of 79 consecutive patients who underwent lumbar spine MRI were classified using the Pfirrmann grading system by six evaluators (three spine surgeons and three radiologists). After a 6-week interval, the same 79 cases were presented to the same evaluators in random order for repeat evaluation. The intra-class correlation coefficient (ICC) and the weighted kappa (wκ) were used to determine the inter- and intra-observer agreement.
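As an illustration of the weighted kappa statistic named above, the following is a minimal sketch for a single pair of raters grading discs on the five Pfirrmann grades. The abstract does not state which weighting scheme was used or how the six evaluators' ratings were combined, so the linear default, the function name `weighted_kappa`, and the sample grades are all assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories=(1, 2, 3, 4, 5), power=1):
    """Cohen's weighted kappa for two raters on an ordinal scale.

    power=1 uses linear disagreement weights |i - j|;
    power=2 uses quadratic weights (i - j)^2.
    The weighting used in the study is not given in the abstract (assumption).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(rater_a)
    position = {c: k for k, c in enumerate(categories)}
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (grade_a, grade_b)
    marginal_a = Counter(rater_a)
    marginal_b = Counter(rater_b)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for ci in categories:
        for cj in categories:
            weight = abs(position[ci] - position[cj]) ** power
            observed_disagreement += weight * observed[(ci, cj)] / n
            # Expected proportion under independence of the two raters.
            expected_disagreement += weight * (marginal_a[ci] / n) * (marginal_b[cj] / n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one identical grade")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical Pfirrmann grades for eight discs (not data from the study).
surgeon = [1, 2, 3, 4, 5, 3, 2, 4]
radiologist = [1, 2, 3, 4, 5, 4, 2, 3]
print(round(weighted_kappa(surgeon, radiologist), 3))
```

Because the weights grow with the distance between grades, two raters who differ by one grade are penalised less than two who differ by three, which suits an ordinal scale such as this one.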


The inter-observer agreement was excellent, with ICC = 0.94 (0.93–0.95) and wκ = 0.83 (0.74–0.91). There were no differences between spine surgeons and radiologists. Likewise, agreement did not differ across the lumbar disc levels evaluated. Most disagreements among observers were of only one grade. Intra-observer agreement was also excellent, with ICC = 0.86 (0.83–0.89) and wκ = 0.89 (0.85–0.93).


In this independent study, the Pfirrmann classification demonstrated adequate agreement among different observers and for the same observer on separate occasions. Furthermore, it facilitates communication between radiologists and spine surgeons.


Keywords: Disc degeneration; Agreement study; Disc degeneration classification; Magnetic resonance imaging; Lumbar spine


Compliance with ethical standards

Conflict of interest

The authors have no conflict of interest to disclose.



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Julio Urrutia (1), Email author
  • Pablo Besa (1)
  • Mauricio Campos (1)
  • Pablo Cikutovic (2)
  • Mario Cabezon (2)
  • Marcelo Molina (1)
  • Juan Pablo Cruz (2)

  1. Department of Orthopaedic Surgery, School of Medicine, Pontificia Universidad Catolica de Chile, Santiago, Chile
  2. Department of Radiology, School of Medicine, Pontificia Universidad Catolica de Chile, Santiago, Chile
