Behavior Research Methods, Volume 47, Issue 3, pp 837–847

EasyDIAg: A tool for easy determination of interrater agreement

  • Henning Holle
  • Robert Rein


Reliable measurements are fundamental for the empirical sciences. In observational research, measurements often consist of observers categorizing behavior into nominal-scaled units. Since the categorization is the outcome of a complex judgment process, it is important to evaluate the extent to which these judgments are reproducible, by having multiple observers independently rate the same behavior. A challenge in determining interrater agreement for timed-event sequential data is to develop clear objective criteria to determine whether two raters’ judgments relate to the same event (the linking problem). Furthermore, many studies presently report only raw agreement indices, without considering the degree to which agreement can occur by chance alone. Here, we present a novel, free, and open-source toolbox (EasyDIAg) designed to assist researchers with the linking problem, while also providing chance-corrected estimates of interrater agreement. Additional tools are included to facilitate the development of coding schemes and rater training.
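To illustrate the difference between raw agreement and a chance-corrected estimate, the following is a minimal sketch of the standard Cohen's kappa for two raters who have each assigned one category to the same set of already-linked events. It is not the EasyDIAg implementation, and it does not address the linking problem; the function name `cohens_kappa` and the example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Standard Cohen's kappa for two raters over the same linked events.

    Illustrative sketch only: assumes the events are already linked,
    so that ratings_a[i] and ratings_b[i] refer to the same event.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must rate the same non-empty set of events.")

    n = len(ratings_a)

    # Raw (observed) agreement: proportion of events with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over categories of the product of the
    # two raters' marginal proportions.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")

    return (observed - expected) / (1.0 - expected)


# Hypothetical example data: ten linked events, two categories.
rater_a = ["gesture"] * 8 + ["rest"] * 2
rater_b = ["gesture"] * 7 + ["rest", "gesture", "rest"]

raw = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(f"raw agreement: {raw:.2f}")
print(f"Cohen's kappa: {cohens_kappa(rater_a, rater_b):.3f}")
```

In this hypothetical example the raters agree on 8 of 10 events, yet because both use one category for most events, much of that agreement is expected by chance, and kappa is considerably lower than the raw agreement index.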


Keywords: Cohen’s kappa, Toolbox, Coding, Annotation, Rater, Agreement



We are grateful to Hedda Lausberg for sharing the annotation data. The acquisition of this data set was supported by a grant awarded to H.L. from the DFG (LA 1249/2-1). The EasyDIAg toolbox described in the article can be downloaded free of charge from




Copyright information

© Psychonomic Society, Inc. 2014

Authors and Affiliations

  1. Department of Psychology, University of Hull, Hull, UK
  2. Institute of Health Promotion and Clinical Movement Science, Department of Neurology, Psychosomatic Medicine, and Psychiatry, German Sport University, Cologne, Germany
