Variance Estimation of Nominal-Scale Inter-Rater Reliability with Random Selection of Raters
 Kilem Li Gwet
Most inter-rater reliability studies using nominal scales suggest the existence of two populations of inference: the population of subjects (collection of objects or persons to be rated) and that of raters. Consequently, the sampling variance of the inter-rater reliability coefficient can be seen as a result of the combined effect of the sampling of subjects and raters. However, all inter-rater reliability variance estimators proposed in the literature only account for the subject sampling variability, ignoring the extra sampling variance due to the sampling of raters, even though the latter may be the biggest of the variance components. Such variance estimators make statistical inference possible only to the subject universe. This paper proposes variance estimators that will make it possible to infer to both universes of subjects and raters. The consistency of these variance estimators is proved as well as their validity for confidence interval construction. These results are applicable only to fully crossed designs where each rater must rate each subject. A small Monte Carlo simulation study is presented to demonstrate the accuracy of large-sample approximations on reasonably small samples.
Psychometrika
Volume 73, Issue 3, pp 407-430
 2008-09-01
 10.1007/s11336-007-9054-8
 0033-3123
 1860-0980
 Springer-Verlag
 inter-rater reliability
 AC1 coefficient
 kappa statistic
 agreement coefficient
 Kilem Li Gwet ^{(1)}
 1. STATAXIS Consulting, Sr. Statistical Consultant, 20315 Marketree Place, Montgomery Village, MD, 20886, USA