Abstract
Agreement between raters on a categorical scale is not only a subject of scientific research but also a problem frequently encountered in practice. For example, in psychiatry the mental illness of a subject may be judged as mild, moderate or severe. Inter- and intra-rater agreement is a prerequisite for a scale to be implemented in routine use, so agreement studies are crucial in health, medicine and the life sciences: they quantify the amount of error inherent in any diagnosis, score or measurement (e.g. a disease diagnosis or the implementation quality of a health promotion intervention). This chapter reviews the kappa-like coefficients (intraclass kappa, Cohen's kappa and weighted kappa) commonly used to assess agreement between or within raters on a categorical scale, with emphasis on their interpretation and properties.
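As a concrete illustration of the basic idea (this sketch is not part of the original chapter; the function name and the counts are hypothetical), Cohen's kappa contrasts the observed agreement between two raters with the agreement expected by chance:

```python
# Minimal sketch: Cohen's kappa for two raters grading severity as
# mild / moderate / severe. Counts are hypothetical illustration only.
import numpy as np

def cohen_kappa(table):
    """Cohen's kappa from a K x K contingency table of joint ratings."""
    p = np.asarray(table, dtype=float)
    p /= p.sum()                         # joint proportions p_jk
    po = np.trace(p)                     # observed agreement
    pe = p.sum(axis=1) @ p.sum(axis=0)   # chance agreement: sum_j p_j. p_.j
    return (po - pe) / (1.0 - pe)

# Rows = rater 1, columns = rater 2 (mild, moderate, severe)
table = [[30, 5, 1],
         [6, 25, 4],
         [2, 3, 24]]
print(round(cohen_kappa(table), 3))      # about 0.68 for these counts
```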
Appendix: Variance of the Kappa Coefficients
The sample variance of Cohen's kappa coefficient obtained by the delta method is given by

\[ \widehat{\operatorname{var}}(\hat{\kappa})=\frac{1}{n{{(1-{p_e})}^2}}\left\{ \sum\limits_{j=1}^K {p_{jj}}{{\left[ 1-({p_{j.}}+{p_{.j}})(1-\hat{\kappa}) \right]}^2}+{{(1-\hat{\kappa})}^2}\sum\limits_{j\ne k} {p_{jk}}{{\left( {p_{.j}}+{p_{k.}} \right)}^2}-{{\left[ \hat{\kappa}-{p_e}(1-\hat{\kappa}) \right]}^2} \right\}, \]

where \( n \) is the number of subjects, \( {p_{jk}} \) the proportion of subjects classified in category \( j \) by the first rater and in category \( k \) by the second, \( {p_{j.}} \) and \( {p_{.j}} \) the corresponding marginal proportions, and \( {p_e}=\sum\nolimits_{j=1}^K {p_{j.}}{p_{.j}} \) the agreement expected by chance.
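A minimal Python sketch of this formula follows (an assumed implementation for illustration, not the author's code; the function name `kappa_and_variance` and the counts are hypothetical):

```python
import numpy as np

def kappa_and_variance(table):
    """Cohen's kappa and its delta-method sample variance."""
    p = np.asarray(table, dtype=float)
    n = p.sum()
    p /= n                                     # joint proportions p_jk
    row, col = p.sum(axis=1), p.sum(axis=0)    # marginals p_j., p_.j
    po, pe = np.trace(p), row @ col
    k = (po - pe) / (1 - pe)
    s = 0.0
    K = p.shape[0]
    for i in range(K):                         # accumulate the cell terms
        for j in range(K):
            if i == j:
                s += p[i, i] * (1 - (row[i] + col[i]) * (1 - k)) ** 2
            else:
                s += p[i, j] * (1 - k) ** 2 * (col[i] + row[j]) ** 2
    var = (s - (k - pe * (1 - k)) ** 2) / (n * (1 - pe) ** 2)
    return k, var

table = [[30, 5, 1], [6, 25, 4], [2, 3, 24]]   # hypothetical counts
k, var = kappa_and_variance(table)
```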
With the additional assumption of no rater bias (\( {p_{j.}}={p_{.j}}={{\bar{p}}_j} \)), the sample variance simplifies to

\[ \widehat{\operatorname{var}}(\hat{\kappa})=\frac{1}{n{{(1-{p_e})}^2}}\left\{ \sum\limits_{j=1}^K {p_{jj}}{{\left[ 1-2{{\bar{p}}_j}(1-\hat{\kappa}) \right]}^2}+{{(1-\hat{\kappa})}^2}\sum\limits_{j\ne k} {p_{jk}}{{\left( {{\bar{p}}_j}+{{\bar{p}}_k} \right)}^2}-{{\left[ \hat{\kappa}-{p_e}(1-\hat{\kappa}) \right]}^2} \right\}, \]

where \( {{\bar{p}}_j}=({p_{j.}}+{p_{.j}})/2 \) and \( {p_e}=\sum\nolimits_{j=1}^K \bar{p}_j^2 \) (\( j,k=1,\cdots,K \)).
The two-sided \( (1-\alpha ) \) confidence interval for \( \kappa \) is then given by \( \hat{\kappa}\pm {Q_z}(1-\alpha /2)\sqrt{\widehat{\operatorname{var}}(\hat{\kappa})} \), where \( {Q_z}(1-\alpha /2) \) is the \( (1-\alpha /2) \) quantile (i.e. the upper \( \alpha /2 \) critical value) of the standard normal distribution.
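Continuing the sketch above (and still assuming the hypothetical `kappa_and_variance` and `table`), a 95 % confidence interval can be computed as:

```python
import numpy as np
from scipy.stats import norm

k, var = kappa_and_variance(table)     # from the sketch above
z = norm.ppf(1 - 0.05 / 2)             # Q_z(1 - alpha/2) for alpha = 0.05
half = z * np.sqrt(var)
print(f"kappa = {k:.3f}, 95% CI = ({k - half:.3f}, {k + half:.3f})")
```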
The sample variance of the weighted kappa coefficient obtained by the delta method is

\[ \widehat{\operatorname{var}}({{\hat{\kappa}}_w})=\frac{1}{n{{(1-{p_{e(w)}})}^2}}\left\{ \sum\limits_{k=1}^K \sum\limits_{j=1}^K {p_{kj}}{{\left[ {w_{kj}}-\left( {{\bar{w}}_{k.}}+{{\bar{w}}_{.j}} \right)(1-{{\hat{\kappa}}_w}) \right]}^2}-{{\left[ {{\hat{\kappa}}_w}-{p_{e(w)}}(1-{{\hat{\kappa}}_w}) \right]}^2} \right\}, \]

where \( {{\bar{w}}_{.j}}=\sum\nolimits_{m=1}^K {w_{mj}}{p_{m.}} \), \( {{\bar{w}}_{k.}}=\sum\nolimits_{s=1}^K {w_{ks}}{p_{.s}} \) and \( {p_{e(w)}}=\sum\nolimits_{k=1}^K \sum\nolimits_{j=1}^K {w_{kj}}{p_{k.}}{p_{.j}} \).
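A self-contained Python sketch of the weighted case (again an assumed implementation for illustration; the function name, the linear weights and the counts are hypothetical):

```python
import numpy as np

def weighted_kappa_and_variance(table, w):
    """Weighted kappa and its delta-method sample variance.
    w[k, j] is the agreement weight (1 on the diagonal, in [0, 1] off it)."""
    p = np.asarray(table, dtype=float)
    w = np.asarray(w, dtype=float)
    n = p.sum()
    p /= n
    row, col = p.sum(axis=1), p.sum(axis=0)   # marginals p_k., p_.j
    po = (w * p).sum()                        # observed weighted agreement
    pe = row @ w @ col                        # sum_kj w_kj p_k. p_.j
    kw = (po - pe) / (1 - pe)
    wbar_row = w @ col                        # \bar w_k. = sum_s w_ks p_.s
    wbar_col = w.T @ row                      # \bar w_.j = sum_m w_mj p_m.
    term = (w - np.add.outer(wbar_row, wbar_col) * (1 - kw)) ** 2
    var = ((p * term).sum() - (kw - pe * (1 - kw)) ** 2) / (n * (1 - pe) ** 2)
    return kw, var

K = 3
idx = np.arange(K)
w_lin = 1 - np.abs(np.subtract.outer(idx, idx)) / (K - 1)   # linear weights
table = [[30, 5, 1], [6, 25, 4], [2, 3, 24]]                # hypothetical counts
kw, var_w = weighted_kappa_and_variance(table, w_lin)
print(f"weighted kappa = {kw:.3f}, var = {var_w:.5f}")
```

With identity weights (\( {w_{kj}}=1 \) if \( k=j \), 0 otherwise) this expression reduces to the unweighted variance given above.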
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this chapter
Vanbelle, S. (2013). Clinical Agreement in Qualitative Measurements. In: Doi, S., Williams, G. (eds) Methods of Clinical Epidemiology. Springer Series on Epidemiology and Public Health. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37131-8_1
Print ISBN: 978-3-642-37130-1
Online ISBN: 978-3-642-37131-8