Inequalities between multi-rater kappas
The paper presents inequalities between four descriptive statistics that have been used to measure the nominal agreement between two or more raters. Each of the four statistics is a function of the pairwise information. Light’s kappa and Hubert’s kappa are multi-rater versions of Cohen’s kappa. Fleiss’ kappa is a multi-rater extension of Scott’s pi, whereas Randolph’s kappa generalizes the S statistic of Bennett et al. to multiple raters. While a consistent ordering between the numerical values of these agreement measures has frequently been observed in practice, there has thus far been no theoretical proof of a general ordering inequality among these measures. It is proved that Fleiss’ kappa is a lower bound of Hubert’s kappa and Randolph’s kappa, and that Randolph’s kappa is an upper bound of Hubert’s kappa and Light’s kappa if all pairwise agreement tables are weakly marginal symmetric or if all raters assign a certain minimum proportion of the objects to a specified category.
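As an illustration of how the four statistics depend only on pairwise information, the following Python sketch computes them from a table of nominal ratings. It assumes the commonly used definitions (Light: mean of the pairwise Cohen's kappas; Hubert: chance correction with the mean pairwise expected agreement; Fleiss: expected agreement from the rater-averaged category proportions; Randolph: expected agreement 1/k for k categories) and may differ in notation from the paper. The function name `multi_rater_kappas` and the example data are hypothetical.

```python
from itertools import combinations


def multi_rater_kappas(ratings, categories):
    """Return Light's, Hubert's, Fleiss' and Randolph's kappa.

    ratings: list of rows, one row per object, one label per rater.
    categories: list of all admissible category labels.
    Assumes every rater rates every object and that no expected
    agreement equals 1 (otherwise a denominator is zero).
    """
    n = len(ratings)        # number of objects
    m = len(ratings[0])     # number of raters
    k = len(categories)     # number of categories

    # Marginal proportion of each category for each rater.
    marg = [
        {c: sum(row[g] == c for row in ratings) / n for c in categories}
        for g in range(m)
    ]

    pairs = list(combinations(range(m), 2))
    # Pairwise observed and expected agreement (as in Cohen's kappa).
    po = [sum(row[g] == row[h] for row in ratings) / n for g, h in pairs]
    pe = [sum(marg[g][c] * marg[h][c] for c in categories) for g, h in pairs]

    mean_po = sum(po) / len(pairs)
    mean_pe = sum(pe) / len(pairs)

    # Light: average of the pairwise Cohen's kappas.
    light = sum((o - e) / (1 - e) for o, e in zip(po, pe)) / len(pairs)
    # Hubert: one chance correction using the averaged pairwise quantities.
    hubert = (mean_po - mean_pe) / (1 - mean_pe)
    # Fleiss: expected agreement from rater-averaged category proportions.
    pbar = [sum(marg[g][c] for g in range(m)) / m for c in categories]
    pe_fleiss = sum(p * p for p in pbar)
    fleiss = (mean_po - pe_fleiss) / (1 - pe_fleiss)
    # Randolph: expected agreement under uniform category use.
    randolph = (mean_po - 1 / k) / (1 - 1 / k)

    return {"light": light, "hubert": hubert,
            "fleiss": fleiss, "randolph": randolph}


# Hypothetical data: six objects, three raters, three categories.
example = [
    ("a", "a", "a"),
    ("a", "b", "a"),
    ("b", "b", "b"),
    ("b", "b", "a"),
    ("c", "c", "c"),
    ("a", "a", "b"),
]
result = multi_rater_kappas(example, ["a", "b", "c"])
print(result)
```

Under these definitions the unconditional part of the stated result can be checked numerically: the value returned for Fleiss' kappa should not exceed the values for Hubert's kappa or Randolph's kappa.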
Keywords: Nominal agreement · Cohen’s kappa · Scott’s pi · Light’s kappa · Hubert’s kappa · Fleiss’ kappa · Randolph’s kappa · Cauchy–Schwarz inequality · Arithmetic-harmonic means inequality
Mathematics Subject Classification (2010): 62H17 · 62H20 · 62P25
The author thanks three anonymous reviewers for their helpful comments and valuable suggestions on earlier versions of this paper.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.