Association between leniency of anesthesiologists when evaluating certified registered nurse anesthetists and when evaluating didactic lectures

Health Care Management Science

Abstract

Daily evaluations of certified registered nurse anesthetists’ (CRNAs’) work habits by anesthesiologists should be adjusted for rater leniency. The current study tested the hypothesis that there is a pairwise association by rater between leniencies of evaluations of CRNAs’ daily work habits and of didactic lectures. The historical cohorts were anesthesiologists’ evaluations over 53 months of CRNAs’ daily work habits and 65 months of didactic lectures by visiting professors and faculty. The binary endpoints were whether the Likert scale scores for all 6 and 10 items, respectively, equaled the maximum of 5 for every item. Mixed effects logistic regression estimated the odds of each ratee performing above or below average, adjusted for rater leniency. Bivariate errors-in-variables least squares linear regression estimated the association between the leniency of the anesthesiologists’ evaluations of work habits and didactic lectures. There were 29/107 (27%) raters who were more severe in their evaluations of CRNAs’ work habits than other anesthesiologists (two-sided P < 0.01); 34/107 (32%) raters were more lenient. When evaluating lectures, 3/81 (4%) raters were more severe and 8/81 (10%) more lenient. Among the 67 anesthesiologists rating both, leniency (or severity) for work habits was not associated with that for lectures (P = 0.90, unitless slope between logits 0.02, 95% confidence interval −0.34 to 0.30). Rater leniency is of large magnitude when making daily clinical evaluations, even when using a valid and psychometrically reliable instrument. Rater leniency was context dependent, not solely a reflection of raters’ personality or rating style.
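
For intuition, the per-rater leniency that the mixed effects logistic regression estimates can be approximated with empirical logits. The following is a minimal illustrative sketch with made-up data, not the study's model (which was fit with Stata's melogit using robust clustered standard errors and covariate adjustment):

```python
import math

def empirical_logit(successes, trials):
    """Log-odds with a 0.5 continuity correction so 0% and 100% stay finite."""
    return math.log((successes + 0.5) / (trials - successes + 0.5))

def rater_leniencies(ratings):
    """ratings maps rater -> list of 0/1 outcomes (1 = every item scored
    the maximum of 5). Leniency is each rater's empirical logit minus the
    pooled logit across all raters: positive values indicate a lenient
    rater, negative values a severe one."""
    total_s = sum(sum(scores) for scores in ratings.values())
    total_n = sum(len(scores) for scores in ratings.values())
    pooled = empirical_logit(total_s, total_n)
    return {rater: empirical_logit(sum(scores), len(scores)) - pooled
            for rater, scores in ratings.items()}

# Made-up example: three raters who each completed 10 evaluations
ratings = {
    "lenient rater": [1] * 8 + [0] * 2,
    "average rater": [1] * 5 + [0] * 5,
    "severe rater":  [1] * 2 + [0] * 8,
}
leniency = rater_leniencies(ratings)
```

A positive leniency here plays the role the rater term plays in the mixed model; the actual study additionally tested each rater's deviation against the others with robust, clustered standard errors.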

Notes

  1. For CRNAs’ daily work habits, covariates not significantly associated with their scores included the number of times the ratee worked with the rater, the number of times the ratee was evaluated by the rater, percent time ratee worked with rater and rater completed evaluation, number of cases started by ratee, days worked by ratee, ratio of cases started to days worked by the ratee, intraoperative hours divided by patient care days, percent cases with patient age < 13 years, percent cases evenings or weekends, percent cases with patient’s American Society of Anesthesiologists’ Physical Status >3, percent with American Society of Anesthesiologists’ base units >8, percent cases with break or handoff, percent cases at the hospital (main) surgical suite, and percent cases at the ambulatory surgery center [16].

  2. For context, in 2017, 81 anesthesiologists had mean (SD) 12.6 (8.5) years of postgraduate practice, 8.1 (9.3) years at the department [22]. The 62 CRNAs had 9.0 (8.5) years of practice, 6.5 (5.6) years at the department [22].

  3. The sorting ensures the minimum change because, as explained below in the subsection “Analyses of work habit scores,” the analyses are binary.

  4. This calculation was performed using StatXact 11.1, Cytel, Cambridge, MA.

  5. Using 0.01 < P < 0.05, there were 3/107 (2.8%) ratees below average and 7/107 (6.5%) above average. When calculations were repeated without the first two years entered as a binary fixed effect, there were still 35/107 (32.7%) of ratees who were significantly (P < 0.01) below or above average.

  6. With the robust, clustered standard errors, there were 63 raters significantly different from the others at P < 0.01, where 63 = 29 + 34. Using asymptotic standard errors, there were 69 raters with P < 0.01. This was the expected and desired result of using robust standard errors. The implication is that use of the robust estimators did not result in markedly underestimated standard errors.

  7. For the current study, for each rater with every score equaling the maximum, we changed the score from 5.00 to 4.83, where 4.83 = (5 items × 5 points + 1 item × 4 points) / 6 items. We did so because we were investigating the raters. For routine use when evaluating ratees, there would be no reason (that we are aware of) to change the scores; such raters’ scores would instead be removed entirely.
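
The arithmetic in the note above can be checked directly (an illustrative sketch, not code from the study):

```python
# A rater who gave the maximum (5 points) on every item has one item
# lowered to 4 points, so the mean over the 6 work-habit items becomes
# (5 items * 5 points + 1 item * 4 points) / 6 items.
N_ITEMS = 6
MAX_POINTS = 5
adjusted_mean = ((N_ITEMS - 1) * MAX_POINTS + 4) / N_ITEMS
print(round(adjusted_mean, 2))  # 4.83
```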

References

  1. Hamilton TE (2004) Centers for Medicare & Medicaid Services (CMS) requirements for hospital medical staff privileging. S&C-05-04. Centers for Medicare & Medicaid Services. https://www.cms.gov/Medicare/Provider-Enrollment-and-Certification/SurveyCertificationGenInfo/Downloads/SCletter05-04.pdf. Accessed 14 May 2020

  2. The Joint Commission (2011) Standards BoosterPak™ for focused professional practice evaluation/ ongoing professional practice evaluation (FPPE/OPPE). Oakbrook Terrace, Illinois

  3. Wikipedia (2020) High-stakes testing. https://en.wikipedia.org/wiki/High-stakes_testing. Accessed 14 May 2020

  4. Dexter F, Bayman EO, Wong CA, Hindman BJ (2020) Reliability of ranking anesthesiologists and nurse anesthetists using leniency-adjusted clinical supervision and work habits scores. J Clin Anesth 61:109639

  5. Ehrenfeld JM, Henneman JP, Peterfreund RA, Sheehan TD, Xue F, Spring S, Sandberg WS (2012) Ongoing professional performance evaluation (OPPE) using automatically captured electronic anesthesia data. Jt Comm J Qual Patient Saf 38:73–80

  6. Bayman EO, Dexter F, Todd MM (2015) Assessing and comparing anesthesiologists’ performance on mandated metrics using a Bayesian approach. Anesthesiology 123:101–115

  7. Bayman EO, Dexter F, Todd MM (2016) Prolonged operative time to extubation is not a useful metric for comparing the performance of individual anesthesia providers. Anesthesiology 124:322–338

  8. Dexter F, Hindman BJ (2016) Do not use hierarchical logistic regression models with low incidence outcome data to compare anesthesiologists in your department. Anesthesiology 125:1083–1084

  9. Epstein RH, Dexter F, Schwenk ES (2017) Hypotension during induction of anaesthesia is neither a reliable nor useful quality measure for comparison of anaesthetists’ performance. Br J Anaesth 119:106–114

  10. Dexter F, Masursky D, Szeluga D, Hindman BJ (2016) Work habits are valid component of evaluations of anesthesia residents based on faculty anesthesiologists’ daily written comments about residents. Anesth Analg 122:1625–1633

  11. Dexter F, Ledolter J, Hindman BJ (2014) Bernoulli cumulative sum (CUSUM) control charts for monitoring of anesthesiologists’ performance in supervising anesthesia residents and nurse anesthetists. Anesth Analg 119:679–685

  12. Bayman EO, Dexter F, Ledolter J (2017) Mixed effects logistic regression modeling of daily evaluations of nurse anesthetists’ work habits adjusting for leniency of the rating anesthesiologists. PCORM 6:14–19

  13. Dexter F, Ledolter J, Hindman BJ (2017) Measurement of faculty anesthesiologists’ quality of clinical supervision has greater reliability when controlling for the leniency of the rating anesthesia resident: a retrospective cohort study. Can J Anesth 64:643–655

  14. Dexter F, Ledolter J, Smith TC, Griffiths D, Hindman BJ (2014) Influence of provider type (nurse anesthetist or resident physician), staff assignments, and other covariates on daily evaluations of anesthesiologists' quality of supervision. Anesth Analg 119:670–678

  15. Dexter F, Ledolter J, Epstein R, Hindman BJ (2017) Operating room anesthesia subspecialization is not associated with significantly greater quality of supervision of anesthesia residents and nurse anesthetists. Anesth Analg 124:1253–1260

  16. Dexter F, Ledolter J, Hindman BJ (2017) Validity of using a work habits scale for the daily evaluation of nurse anesthetists’ clinical performance while controlling for the leniencies of the rating anesthesiologists. J Clin Anesth 42:63–68

  17. Logvinov II, Dexter F, Hindman BJ, Brull SD (2017) Anesthesiologists’ perceptions of minimum acceptable work habits of nurse anesthetists. J Clin Anesth 38:107–110

  18. Bernardin HJ, Cooke DK, Villanova P (2000) Conscientiousness and agreeableness as predictors of rating leniency. J Appl Psychol 85:232–236

  19. Spence JR, Keeping LM (2010) The impact of non-performance information on ratings of job performance: A policy-capturing approach. J Organ Behav 31:587–608

  20. Dewberry C, Davies-Muir A, Newell S (2013) Impact and causes of rater severity/leniency in appraisals without postevaluation communication between raters and ratees. Int J Sel Assess 21:286–293

  21. Dannefer EF, Henson LC, Bierer SB, Grady-Weliky TA, Meldrum S, Nofziger AC, Barclay C, Epstein RM (2005) Peer assessment of professional competence. Med Educ 39:713–722

  22. O’Brien MK, Dexter F, Kreiter CD, Slater-Scott C, Hindman BJ (2019) Nurse anesthetists’ evaluations of anesthesiologists’ operating room performance are sensitive to anesthesiologists’ years of postgraduate practice. J Clin Anesth 54:102–110

  23. University of Iowa Carver College of Medicine (2007) Peer evaluation of teaching. https://www.medicine.uiowa.edu/facultyaffairs/sites/medicine.uiowa.edu.facultyaffairs/files/wysiwyg_uploads/PeerTeachingEvaluation.pdf. Accessed 14 May 2020

  24. melogit — Multilevel mixed-effects logistic regression. Stata. https://www.stata.com/manuals13/memelogit.pdf. Accessed 14 May 2020

  25. Sribney B (2020) Advantages of the robust variance estimator. Stata. https://www.stata.com/support/faqs/statistics/robust-variance-estimator/. Accessed 14 May 2020

  26. Nichols A, Schaffer M (2007) Clustered errors in Stata. Stata. https://www.stata.com/meeting/13uk/nichols_crse.pdf. Accessed 14 May 2020

  27. Glance LG, Dick AW (2016) In response. Anesth Analg 122:1722–1727

  28. Robust and clustered standard errors. Stata. https://www.stata.com/manuals/semintro8.pdf. Accessed 14 May 2020

  29. York D (1969) Least squares fitting of a straight line with correlated errors. Earth Planet Sci Lett 5:320–324

  30. Williamson JH (1968) Least-squares fitting of a straight line. Can J Phys 46:1845–1847

  31. Cantrell CA (2008) Review of methods for linear least-squares fitting of data and application to atmospheric chemistry problems. Atmos Chem Phys 8:5477–5487

  32. Tellinghuisen J (2010) Least-squares analysis of data with uncertainty in x and y: A Monte Carlo methods comparison. Chemom Intell Lab Syst 103:160–169

  33. Dexter F, Hadlandsmyth K, Pearson ACS, Hindman BJ (2020) Reliability and validity of performance evaluations of pain medicine clinical faculty by residents and fellows using a supervision scale. Anesth Analg https://doi.org/10.1213/ANE.0000000000004779

  34. Webb NM, Shavelson RJ, Haertel EH (2006) Reliability coefficients and generalizability theory. Handbook of Statistics 26:81–124

  35. Jeon Y, Meretoja R, Vahlberg T, Leino-Kilpi H (2020) Developing and psychometric testing of the anaesthesia nursing competence scale. J Eval Clin Pract 26:866–878

  36. Müller T, Montano D, Poinstingl H, Dreiling K, Schiekirka-Schwake S, Anders S, Raupach T, von Steinbüchel N (2017) Evaluation of large-group lectures in medicine - development of the SETMED-L (Student Evaluation of Teaching in MEDical Lectures) questionnaire. BMC Med Educ 17:137

  37. Perella P, Palmer E, Conway R, Wong DJN (2019) A retrospective analysis of case-load and supervision from a large anaesthetic logbook database. Anaesthesia 74:1524–1533

Author information

Correspondence to Franklin Dexter.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

ESM 1

(PDF 99 kb)

Cite this article

Dexter, F., Ledolter, J., Wong, C.A. et al. Association between leniency of anesthesiologists when evaluating certified registered nurse anesthetists and when evaluating didactic lectures. Health Care Manag Sci 23, 640–648 (2020). https://doi.org/10.1007/s10729-020-09518-0
