
Clinical Agreement in Qualitative Measurements

The Kappa Coefficient in Clinical Research

Methods of Clinical Epidemiology

Part of the book series: Springer Series on Epidemiology and Public Health (SSEH)

Abstract

Agreement between raters on a categorical scale is not only a subject of scientific research but also a problem frequently encountered in practice. For example, in psychiatry, the mental illness of a subject may be judged as mild, moderate or severe. Inter- and intra-rater agreement is a prerequisite for such a scale to be implemented in routine use. Agreement studies are therefore crucial in health, medicine and the life sciences: they quantify the amount of error inherent in any diagnosis, score or measurement (e.g. a disease diagnosis or the implementation quality of health promotion interventions). The kappa-like coefficients (intraclass kappa, Cohen's kappa and weighted kappa) usually used to assess agreement between or within raters on a categorical scale are reviewed in this chapter, with emphasis on the interpretation and the properties of these coefficients.




Author information

Correspondence to Sophie Vanbelle.

Appendix: Variance of the Kappa Coefficients

The sample variance of Cohen’s kappa coefficient using the delta method is given by

$$ \operatorname{var}(\hat{\kappa})=\frac{p_{\mathrm{o}}(1-p_{\mathrm{o}})}{N(1-p_{\mathrm{e}})^{2}}+\frac{2(p_{\mathrm{o}}-1)(C_{1}-2p_{\mathrm{o}}p_{\mathrm{e}})}{N(1-p_{\mathrm{e}})^{3}}+\frac{(p_{\mathrm{o}}-1)^{2}(C_{2}-4p_{\mathrm{e}}^{2})}{N(1-p_{\mathrm{e}})^{4}} $$
(1.5)

where

$$ C_{1}=\sum_{j=1}^{K}p_{jj}(p_{j.}+p_{.j})\quad\text{and}\quad C_{2}=\sum_{j=1}^{K}\sum_{k=1}^{K}p_{jk}(p_{.j}+p_{k.})^{2}. $$
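As a practical illustration, Eq. (1.5) can be sketched in Python as follows, assuming the agreement data are available as a \( K\times K \) NumPy array p of joint proportions \( p_{jk} \) (rows for the first rater, columns for the second) together with the sample size N; the function name cohen_kappa_var is chosen here purely for illustration.

```python
import numpy as np

def cohen_kappa_var(p, N):
    """Cohen's kappa and its delta-method variance, following Eq. (1.5).

    p : (K, K) array of joint proportions p_jk summing to 1
        (rows = first rater, columns = second rater).
    N : number of subjects rated by both raters.
    """
    p = np.asarray(p, dtype=float)
    row, col = p.sum(axis=1), p.sum(axis=0)        # marginals p_j. and p_.j
    po = np.trace(p)                               # observed agreement p_o
    pe = np.sum(row * col)                         # chance agreement p_e
    kappa = (po - pe) / (1 - pe)

    c1 = np.sum(np.diag(p) * (row + col))                   # C_1
    c2 = np.sum(p * (col[:, None] + row[None, :]) ** 2)     # C_2

    var = (po * (1 - po) / (N * (1 - pe) ** 2)
           + 2 * (po - 1) * (c1 - 2 * po * pe) / (N * (1 - pe) ** 3)
           + (po - 1) ** 2 * (c2 - 4 * pe ** 2) / (N * (1 - pe) ** 4))
    return kappa, var
```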

With the additional assumption of no rater bias, the sample variance simplifies to

$$ \operatorname{var}(\hat{\kappa}_{\mathrm{I}})=\frac{1}{N(1-C_{3})^{2}}\left\{ \sum_{j=1}^{K}p_{jj}\bigl[1-4\bar{p}_{j}(1-\hat{\kappa}_{\mathrm{I}})\bigr]+(1-\hat{\kappa}_{\mathrm{I}})^{2}\sum_{j=1}^{K}\sum_{k=1}^{K}p_{jk}(\bar{p}_{j}+\bar{p}_{k})^{2}-\bigl[\hat{\kappa}_{\mathrm{I}}-C_{3}(1-\hat{\kappa}_{\mathrm{I}})\bigr]^{2} \right\} $$
(1.6)

where \( \bar{p}_{j}=(p_{j.}+p_{.j})/2 \) and \( C_{3}=\sum_{j=1}^{K}\bar{p}_{j}^{2} \) (\( j,k=1,\ldots,K \)).
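Equation (1.6) can be sketched in the same way. The snippet below additionally assumes that the intraclass kappa is estimated as \( \hat{\kappa}_{\mathrm{I}}=(p_{\mathrm{o}}-C_{3})/(1-C_{3}) \), with \( C_{3} \) playing the role of the chance agreement; that estimator is not restated in this excerpt, so treat it as an assumption of the sketch.

```python
import numpy as np

def intraclass_kappa_var(p, N):
    """Intraclass kappa and its variance under no rater bias, following Eq. (1.6).

    Assumes kappa_I = (p_o - C3) / (1 - C3), with C3 = sum_j pbar_j**2 and
    pbar_j = (p_j. + p_.j) / 2 the common marginal under the no-bias assumption.
    """
    p = np.asarray(p, dtype=float)
    row, col = p.sum(axis=1), p.sum(axis=0)
    pbar = (row + col) / 2.0                       # pbar_j
    po = np.trace(p)                               # observed agreement p_o
    c3 = np.sum(pbar ** 2)                         # C_3
    ki = (po - c3) / (1 - c3)                      # assumed estimator of kappa_I

    term1 = np.sum(np.diag(p) * (1 - 4 * pbar * (1 - ki)))
    term2 = (1 - ki) ** 2 * np.sum(p * (pbar[:, None] + pbar[None, :]) ** 2)
    term3 = (ki - c3 * (1 - ki)) ** 2
    var = (term1 + term2 - term3) / (N * (1 - c3) ** 2)
    return ki, var
```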

The two-sided \( (1-\alpha) \) confidence interval for \( \kappa \) is then given by \( \hat{\kappa}\pm Q_{z}(1-\alpha/2)\sqrt{\operatorname{var}(\hat{\kappa})} \), where \( Q_{z}(1-\alpha/2) \) is the \( (1-\alpha/2) \) quantile (i.e. the upper \( \alpha/2 \) percentile) of the standard Normal distribution.
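As a worked example, the confidence interval can be obtained from the cohen_kappa_var sketch above; the 3×3 joint-proportion table and sample size used below are hypothetical.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 3x3 joint-proportion table for two raters, with N = 100 subjects
p = np.array([[0.25, 0.05, 0.00],
              [0.05, 0.30, 0.05],
              [0.00, 0.05, 0.25]])
N = 100

kappa, var = cohen_kappa_var(p, N)
z = norm.ppf(1 - 0.05 / 2)                 # Q_z(1 - alpha/2) for alpha = 0.05
lower, upper = kappa - z * np.sqrt(var), kappa + z * np.sqrt(var)
print(f"kappa = {kappa:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```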

The sample variance of the weighted kappa coefficient obtained by the delta method is

$$ \operatorname{var}(\hat{\kappa}_{\mathrm{w}})=\frac{1}{N(1-p_{\mathrm{ew}})^{4}}\left\{ \sum_{j=1}^{K}\sum_{k=1}^{K}p_{jk}\bigl[w_{jk}(1-p_{\mathrm{ew}})-(\bar{w}_{j.}+\bar{w}_{.k})(1-p_{\mathrm{ow}})\bigr]^{2}-(p_{\mathrm{ow}}p_{\mathrm{ew}}-2p_{\mathrm{ew}}+p_{\mathrm{ow}})^{2} \right\} $$
(1.7)

where \( \bar{w}_{j.}=\sum_{s=1}^{K}w_{js}p_{.s} \) and \( \bar{w}_{.k}=\sum_{m=1}^{K}w_{mk}p_{m.} \).
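A sketch of Eq. (1.7) under the same data layout is given below; it further assumes a \( K\times K \) matrix of agreement weights \( w_{jk} \) with \( w_{jj}=1 \) (e.g. linear or quadratic weights) and the usual definitions \( p_{\mathrm{ow}}=\sum_{j,k}w_{jk}p_{jk} \) and \( p_{\mathrm{ew}}=\sum_{j,k}w_{jk}p_{j.}p_{.k} \), which are not restated in this excerpt.

```python
import numpy as np

def weighted_kappa_var(p, w, N):
    """Weighted kappa and its delta-method variance, following Eq. (1.7).

    Assumes agreement weights w_jk with w_jj = 1 and the usual definitions
    p_ow = sum_jk w_jk p_jk and p_ew = sum_jk w_jk p_j. p_.k.
    """
    p, w = np.asarray(p, dtype=float), np.asarray(w, dtype=float)
    row, col = p.sum(axis=1), p.sum(axis=0)        # p_j. and p_.j
    pow_ = np.sum(w * p)                           # observed weighted agreement p_ow
    pew = np.sum(w * np.outer(row, col))           # chance weighted agreement p_ew
    kw = (pow_ - pew) / (1 - pew)

    wbar_row = w @ col                             # wbar_j. = sum_s w_js p_.s
    wbar_col = row @ w                             # wbar_.k = sum_m w_mk p_m.
    bracket = (w * (1 - pew)
               - (wbar_row[:, None] + wbar_col[None, :]) * (1 - pow_))
    var = (np.sum(p * bracket ** 2)
           - (pow_ * pew - 2 * pew + pow_) ** 2) / (N * (1 - pew) ** 4)
    return kw, var
```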

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Vanbelle, S. (2013). Clinical Agreement in Qualitative Measurements. In: Doi, S., Williams, G. (eds) Methods of Clinical Epidemiology. Springer Series on Epidemiology and Public Health. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37131-8_1

  • DOI: https://doi.org/10.1007/978-3-642-37131-8_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37130-1

  • Online ISBN: 978-3-642-37131-8

  • eBook Packages: Medicine, Medicine (R0)
