Abstract
The paper discusses several reliability measures: Scott’s pi, Krippendorff’s alpha, free-marginal adjustment (Bennett, Alpert and Goldstein’s \(S\)), Cohen’s kappa, and Perreault and Leigh’s \(I\), together with the assumptions on which they are based. It is suggested that correlation coefficients between the distribution of qualitative codes, on the one hand, and word co-occurrences and the distribution of the categories identified with the help of the substitution-based dictionary, on the other, complement these reliability measures. The paper shows that the choice of reliability measure depends on the format of the text (stylistic versus rhetorical) and the type of reading (comprehension versus interpretation). Namely, Cohen’s kappa and Bennett, Alpert and Goldstein’s \(S\) emerge as reliability measures particularly suited to perspectival reading of rhetorical texts. Outcomes of the content analysis of 57 texts performed by four coders with the help of the computer program QDA Miner inform the analysis.
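As a point of reference for the two measures the abstract singles out, the following is a minimal sketch (not the authors’ implementation, which relied on QDA Miner) of how Cohen’s kappa and Bennett, Alpert and Goldstein’s \(S\) are computed for two coders assigning nominal codes. The function names, code labels, and example data are illustrative assumptions only.

```python
# Minimal sketch: Cohen's kappa and Bennett, Alpert and Goldstein's S
# for two coders assigning nominal codes to the same units.
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Chance correction uses each coder's own marginal code frequencies."""
    n = len(codes_a)
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    categories = set(codes_a) | set(codes_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)


def bennett_s(codes_a, codes_b, k):
    """'Free marginal' correction: chance agreement is 1/k for k categories."""
    n = len(codes_a)
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    p_e = 1 / k
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codings of ten text segments (code labels are invented).
coder_1 = ["power", "trust", "power", "market", "trust",
           "power", "market", "trust", "power", "market"]
coder_2 = ["power", "trust", "market", "market", "trust",
           "power", "market", "power", "power", "market"]

print(cohens_kappa(coder_1, coder_2))    # marginal-dependent chance correction
print(bennett_s(coder_1, coder_2, k=3))  # free-marginal chance correction
```

The two coefficients differ only in how chance agreement is estimated: kappa conditions on the coders’ observed marginal distributions, while \(S\) assumes all \(k\) categories are equally likely a priori.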
Notes
A similar assumption underpins the use of Cronbach’s \(\alpha \) in the cultural consensus theory. The agreement between coders presumably depends on how well they know the content of a cultural domain that exists independently of their input (Weller 2007, p. 343).
This assumption also serves to minimize the influence of the coders’ values on the outcomes of content analysis. If the content analysis is not value-free, then the coders are less likely to agree on the distribution of the categories. The possibility theorem, which is applicable to choices guided by values, states that “for any method of deriving social choices by aggregating individual preference patterns which satisfies certain natural conditions, it is possible to find individual preference patterns which give rise to a social choice pattern which is not a linear ordering” (Arrow 1950, p. 330).
This rationale should not be confused with another, more “positivist” argument advanced by Krippendorff (2004a, p. 249): “we must estimate the distribution of categories in the population of phenomena from the judgments of as many observers as possible (at least two), making the common assumption that observer differences wash out in their average”.
Being a recent university graduate, the fourth co-author has not produced enough publications yet. She played the role of a “perfect reader” whose take on a text is not affected by the authorship of the other texts included in the sample.
When assessing this level of inter-coder agreement, one has to bear in mind that it reflects both the reliability of unitizing and the reliability of coding.
The dictionary based on substitution was subject to small edits only at the third stage.
The code book for analyzing the first co-author’s texts contained 13 codes; the code book for the second co-author’s texts contained 15 codes; and the code book for the third co-author’s texts contained nine codes.
The distribution of the reliability measures was visually inspected prior to correlation analysis. This “eyeballing” suggested that the normality condition was not seriously violated.
References
Arrow, K.J.: A difficulty in the concept of social welfare. J. Polit. Econ. 58(4), 328–346 (1950)
Artstein, R., Poesio, M.: Inter-coder agreement for computational linguistics. Comput. Linguist. 34(4), 555–596 (2008)
Bennett, E., Alpert, R., Goldstein, A.C.: Communications through limited-response questioning. Public Opin. Q. 18(3), 303–308 (1954)
Bryman, A., Bell, E., Teevan, J.J.: Social Research Methods, 3rd edn. Oxford University Press, Don Mills (2012)
Camp, S.D., Saylor, W.G., Harer, M.D.: Aggregating individual-level evaluations of the organizational social climate: a multilevel investigation of the work environment at the Federal Bureau of Prisons. Justice Q. 14(4), 739–762 (1997)
Dijkstra, L., van Eijnatten, F.M.: Agreement and consensus in a Q-mode research design: an empirical comparison of measures, and an application. Qual. Quant. 43(5), 757–771 (2009)
Hayes, A.F., Krippendorff, K.: Answering the call for a standard reliability measure for coding data. Commun. Methods Meas. 1(1), 77–89 (2007)
Krippendorff, K.: Content Analysis: An Introduction to Its Methodology. SAGE, Thousand Oaks (2004a)
Krippendorff, K.: Measuring the reliability of qualitative text analysis data. Qual. Quant. 38(6), 787–800 (2004b)
Lotman, Y.: Universe of the Mind: A Semiotic Theory of Culture. Indiana University Press, Bloomington (1990)
Muñoz-Leiva, F., Montoro-Ríos, F.J., Luque-Martínez, T.: Assessment of interjudge reliability in the open-ended questions coding process. Qual. Quant. 40(4), 519–537 (2006)
Neuendorf, K.A.: The Content Analysis Guidebook. SAGE, Thousand Oaks (2002)
Norris, S.P., Phillips, L.M.: The relevance of a reader’s knowledge within a perspectival view of reading. J. Read. Behav. 26(4), 391–412 (1994)
Oleinik, A.: Mixing quantitative and qualitative content analysis: triangulation at work. Qual. Quant. 45(4), 859–873 (2010)
Oleinik, A., Kirdina, S., Popova, I., Shatalova, T.: Kak uchenye chitayut drug druga: osnova teorii akademicheskogo chteniya [How scientists read: on a theory of academic reading]. SOCIS 8 (2013)
Perreault, W.D., Leigh, L.E.: Reliability of nominal data based on qualitative judgments. J. Mark. Res. 26(2), 135–148 (1989)
Salton, G., McGill, M.: Introduction to Modern Information Retrieval. McGraw-Hill Book Co., New York (1983)
Scott, W.A.: Reliability of content analysis: the case of nominal scale coding. Public Opin. Q. 19(3), 321–325 (1955)
Siegel, S., Castellan, N.J.: Nonparametric Statistics for the Behavioral Sciences. McGraw-Hill, New York (1988)
Skinner, Q.: Visions of Politics. Cambridge University Press, Cambridge (2002)
Warner, R.M.: Applied Statistics. SAGE, Thousand Oaks (2008)
Weller, S.C.: Cultural consensus theory: applications and frequently asked questions. Field Methods 19(4), 339–368 (2007)
Acknowledgments
The authors would like to thank the anonymous reviewers of Quality & Quantity for their helpful and constructive suggestions and comments. However, all remaining errors and inaccuracies are solely attributable to the authors.
Cite this article
Oleinik, A., Popova, I., Kirdina, S. et al. On the choice of measures of reliability and validity in the content-analysis of texts. Qual Quant 48, 2703–2718 (2014). https://doi.org/10.1007/s11135-013-9919-0