External proficiency testing improves inter-scorer reliability of polysomnography scoring

  • Sleep Breathing Physiology and Disorders • Original Article
Sleep and Breathing

Abstract

Purpose

This study evaluated whether polysomnography (PSG) inter-scorer reliability (ISR) across sleep centres could be improved by external proficiency testing (EPT), or by EPT combined with method alignment training.

Methods

Experienced scorers from 15 sleep centres were randomised to one of three groups: (1) a control group, (2) a group that received a self-directed intervention of EPT reports (EPTPassive) or (3) a group that received an active intervention of method alignment training and EPT reports (EPTActive). Respiratory, arousal and sleep scoring ISR from 16 PSG fragments were compared between groups across time.

Results

Among 30 scorers, there were no ISR changes in controls between baseline (BL) and 6 months (6 m). Both EPT groups showed ISR improvement from BL to 6 m for respiratory, arousal and sleep scoring (p < 0.05). Respiratory scoring back-transformed mean (95% CI) proportion of specific agreement (PSA) for the EPTPassive group improved from 0.78 (0.72–0.84) to 0.80 (0.74–0.86) and for the EPTActive group from 0.80 (0.74–0.85) to 0.82 (0.76–0.88). Arousal scoring PSA for the EPTPassive group improved from 0.72 (0.66–0.77) to 0.74 (0.69–0.79) and for the EPTActive group from 0.71 (0.65–0.76) to 0.77 (0.72–0.82). Sleep scoring kappa for the EPTPassive group improved from 0.64 (0.58–0.69) to 0.73 (0.68–0.77) and for the EPTActive group from 0.75 (0.71–0.80) to 0.80 (0.76–0.85). Overall, poorer performers achieved greater improvement.
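
For readers unfamiliar with the two agreement statistics reported above, the following is a minimal Python sketch of their textbook definitions: Cohen's kappa for epoch-by-epoch sleep staging, and the proportion of specific (positive) agreement for event scoring. The abstract does not describe the study's event-matching rules, its reference scoring or the transformation behind the back-transformed means, so none of that is reproduced here. The function names, stage labels and counts are illustrative assumptions, not study data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of epochs given the same label by both scorers.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each scorer's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def specific_agreement(both, only_a, only_b):
    """Proportion of specific (positive) agreement: 2a / (2a + b + c)."""
    total = 2 * both + only_a + only_b
    if total == 0:
        raise ValueError("no events scored by either scorer")
    return 2 * both / total


# Hypothetical sleep stages for eight epochs scored by two scorers.
scorer_1 = ["W", "N1", "N2", "N2", "N3", "R", "R", "N2"]
scorer_2 = ["W", "N2", "N2", "N2", "N3", "R", "N2", "N2"]
kappa = cohen_kappa(scorer_1, scorer_2)

# Hypothetical event counts: 40 events marked by both scorers,
# 10 marked only by scorer 1 and 10 marked only by scorer 2.
psa = specific_agreement(both=40, only_a=10, only_b=10)

print(f"kappa = {kappa:.2f}, PSA = {psa:.2f}")
```

Unlike raw percentage agreement, kappa discounts the agreement expected by chance, while PSA ignores the large number of epochs in which neither scorer marked an event, which is why it is commonly preferred for sparse events such as arousals and respiratory events.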

Conclusion

External proficiency testing produced modest, statistically significant PSG inter-scorer reliability improvements among experienced scorers across sleep centres, with potential to improve clinical management of individual patients and increase research study statistical power.

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

The authors gratefully acknowledge the contributions of fifteen laboratories and thirty scorers who participated in this study. The authors also thank Prof David Berlowitz and Assoc. Prof Fergal O’Donoghue for review of the manuscript and Marnie Collins for statistical analysis.

Funding

The study was funded by a grant from the Australasian Sleep Trials Network.

Author information

Contributions

Conceptualisation: Peter Rochford, Andrew Thornton; study design: Peter Rochford, Andrew Thornton, Warren Ruehland, Robert Pierce; data collection and manuscript preparation: Warren Ruehland; data analysis, results interpretation and software development: Warren Ruehland, Peter Rochford, Andrew Thornton; scorer workshops and scorer manual development: all authors; manuscript revision and editing: Warren Ruehland, Peter Rochford, Andrew Thornton, Parmjit Singh. All authors except Robert Pierce (deceased) read and approved the final manuscript.

Corresponding author

Correspondence to Warren R. Ruehland.

Ethics declarations

Ethical approval

This study was approved by the Austin Health and Royal Adelaide Hospital Human Research Ethics Committees. All procedures performed were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Conflict of interest

Warren Ruehland, Peter Rochford and Andrew Thornton are directors of Respiratory Quality Assurance Pty Ltd, which provides the PSG scoring quality assurance program QSleep. Parmjit Singh provides contracting services for QSleep. Robert Pierce is deceased and is included as an author due to his contribution to this research.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 590 KB)

About this article

Cite this article

Ruehland, W.R., Rochford, P.D., Pierce, R.J. et al. External proficiency testing improves inter-scorer reliability of polysomnography scoring. Sleep Breath 27, 923–932 (2023). https://doi.org/10.1007/s11325-022-02673-4

  • DOI: https://doi.org/10.1007/s11325-022-02673-4
