Quality & Quantity, Volume 48, Issue 5, pp 2703–2718

On the choice of measures of reliability and validity in the content-analysis of texts

  • Anton Oleinik
  • Irina Popova
  • Svetlana Kirdina
  • Tatyana Shatalova


The paper discusses several reliability measures (Scott’s pi, Krippendorff’s alpha, free marginal adjustment (Bennett, Alpert and Goldstein’s \(S\)), Cohen’s kappa, and Perreault and Leigh’s \(I\)) and the assumptions on which they are based. It suggests that these measures are complemented by correlation coefficients between the distribution of qualitative codes, on the one hand, and word co-occurrences and the distribution of the categories identified with the help of a dictionary based on substitution, on the other. The paper shows that the choice of the reliability measure depends on the format of the text (stylistic versus rhetorical) and the type of reading (comprehension versus interpretation). In particular, Cohen’s kappa and Bennett, Alpert and Goldstein’s \(S\) emerge as reliability measures particularly suited to the perspectival reading of rhetorical texts. The analysis draws on the outcomes of a content analysis of 57 texts performed by four coders with the help of the computer program QDA Miner.
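To make the differing assumptions of these measures concrete, the following is a minimal sketch, not the authors' procedure, that computes four of them for two coders assigning nominal codes, using the standard textbook definitions. The function name `agreement_measures`, the category labels and the sample codings are hypothetical and serve only as illustration; the study itself involved four coders and QDA Miner. Krippendorff's alpha is left out to keep the sketch short. The measures differ mainly in how expected chance agreement is modelled: \(S\) and \(I\) assume all categories are equally likely, Scott's pi pools the two coders' marginal distributions, and Cohen's kappa keeps each coder's marginals separate.

```python
from collections import Counter
from math import sqrt


def agreement_measures(coder_a, coder_b, categories):
    """Chance-corrected agreement for two coders and nominal codes.

    Illustrative sketch only: assumes both coders coded the same units,
    in the same order, and that expected agreement is below 1.
    """
    n = len(coder_a)
    k = len(categories)

    # Observed proportion of units on which the two coders agree.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)

    # Bennett, Alpert and Goldstein's S: chance agreement is 1/k
    # (free marginals, every category equally likely).
    s = (p_o - 1 / k) / (1 - 1 / k)

    # Scott's pi: chance agreement from the pooled marginal distribution.
    p_e_pi = sum(((freq_a[c] + freq_b[c]) / (2 * n)) ** 2 for c in categories)
    scott_pi = (p_o - p_e_pi) / (1 - p_e_pi)

    # Cohen's kappa: chance agreement from each coder's own marginals.
    p_e_kappa = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    cohen_kappa = (p_o - p_e_kappa) / (1 - p_e_kappa)

    # Perreault and Leigh's I: square root of the 1/k-corrected agreement,
    # set to 0 when observed agreement falls below 1/k.
    perreault_leigh_i = sqrt(s) if s > 0 else 0.0

    return {
        "observed": p_o,
        "S": s,
        "scott_pi": scott_pi,
        "cohen_kappa": cohen_kappa,
        "perreault_leigh_I": perreault_leigh_i,
    }


# Hypothetical codings of ten text units into two categories, "A" and "B".
sample_a = list("AAAAAABBBB")
sample_b = list("AAAAAABBBA")
for name, value in agreement_measures(sample_a, sample_b, ["A", "B"]).items():
    print(f"{name}: {value:.3f}")
```

Because the two hypothetical coders use the categories at slightly different rates, Scott's pi and Cohen's kappa give slightly different values here; they coincide whenever the coders' marginal distributions are identical.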


Keywords: Reliability measures; Content analysis; Correlation analysis; Interpretation; Comprehension; Stylistic texts; Rhetorical texts



The authors would like to thank the anonymous reviewers of Quality & Quantity for their helpful and constructive suggestions and comments. However, all remaining errors and inaccuracies are solely attributable to the authors.


  1. Arrow, K.J.: A difficulty in the concept of social welfare. J. Polit. Econ. 58(4), 328–346 (1950)
  2. Artstein, R., Poesio, M.: Inter-coder agreement for computational linguistics. Comput. Linguist. 34(4), 555–596 (2008)
  3. Bennett, E., Alpert, R., Goldstein, A.C.: Communications through limited-response questioning. Public Opin. Q. 18(3), 303–308 (1954)
  4. Bryman, A., Bell, E., Teevan, J.J.: Social Research Methods, 3rd edn. Oxford University Press, Don Mills (2012)
  5. Camp, S.D., Saylor, W.G., Harer, M.D.: Aggregating individual-level evaluations of the organizational social climate: a multilevel investigation of the work environment at the Federal Bureau of Prisons. Justice Q. 14(4), 739–762 (1997)
  6. Dijkstra, L., van Eijnatten, F.M.: Agreement and consensus in a Q-mode research design: an empirical comparison of measures, and an application. Qual. Quant. 43(5), 757–771 (2009)
  7. Hayes, A.F., Krippendorff, K.: Answering the call for a standard reliability measure for coding data. Commun. Methods Meas. 1(1), 77–89 (2007)
  8. Krippendorff, K.: Content Analysis: An Introduction to Its Methodology. SAGE, Thousand Oaks (2004a)
  9. Krippendorff, K.: Measuring the reliability of qualitative text analysis data. Qual. Quant. 38(6), 787–800 (2004b)
  10. Lotman, Y.: Universe of the Mind: A Semiotic Theory of Culture. Indiana University Press, Bloomington (1990)
  11. Muñoz-Leiva, F., Montoro-Ríos, F.J., Luque-Martínez, T.: Assessment of interjudge reliability in the open-ended questions coding process. Qual. Quant. 40(4), 519–537 (2006)
  12. Neuendorf, K.A.: The Content Analysis Guidebook. SAGE, Thousand Oaks (2002)
  13. Norris, S.P., Phillips, L.M.: The relevance of a reader’s knowledge within a perspectival view of reading. J. Read. Behav. 26(4), 391–412 (1994)
  14. Oleinik, A.: Mixing quantitative and qualitative content analysis: triangulation at work. Qual. Quant. 45(4), 859–873 (2010)
  15. Oleinik, A., Kirdina, S., Popova, I., Shatalova, T.: Kak uchenye chitayut drug druga: osnova teorii akademicheskogo chteniya [How scientists read: on a theory of academic reading]. SOCIS 8 (2013)
  16. Perreault, W.D., Leigh, L.E.: Reliability of nominal data based on qualitative judgments. J. Mark. Res. 26(2), 135–148 (1989)
  17. Salton, G., McGill, M.: Introduction to Modern Information Retrieval. McGraw-Hill Book Co., New York (1983)
  18. Scott, W.A.: Reliability of content analysis: the case of nominal scale coding. Public Opin. Q. 19(3), 321–325 (1955)
  19. Siegel, S., Castellan, N.J.: Nonparametric Statistics for the Behavioural Sciences. McGraw Hill, New York (1988)
  20. Skinner, Q.: Visions of Politics. Cambridge University Press, Cambridge (2002)
  21. Warner, R.M.: Applied Statistics. SAGE, Thousand Oaks (2008)
  22. Weller, S.C.: Cultural consensus theory: applications and frequently asked questions. Field Methods 19(4), 339–368 (2007)

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Anton Oleinik (1, 2)
  • Irina Popova (3, 4)
  • Svetlana Kirdina (5)
  • Tatyana Shatalova (6)

  1. Department of Sociology, Memorial University of Newfoundland, St. John’s, Canada
  2. Central Economics and Mathematics Institute, Russian Academy of Sciences, Moscow, Russia
  3. Institute of Sociology, Russian Academy of Sciences, Moscow, Russia
  4. Higher School of Economics, Moscow, Russia
  5. Institute of Economics, Russian Academy of Sciences, Moscow, Russia
  6. Moscow State Lomonossov University, Moscow, Russia
