Abstract
Content analysis of traditional and social media plays a central role in investigating features of media content, measuring media exposure, and calculating media effects. The reliability of content coding is usually evaluated using “kappa-centric” agreement measures, but these measures aggregate individual coder decisions and thereby obscure the performance of individual coders. Using a data set of 105 sports and energy drink advertisements coded by five coders, we demonstrate that Item Response Theory (IRT) can track coder performance over time and give coder-specific information on the consistency of decisions over qualitatively coded objects. We conclude that IRT should be added to content analysts’ tool kit of useful methodologies for tracking and evaluating content coders’ performance.
Notes
Because IRT results are derived from factor analysis, extensions of the IRT model for measuring additional aspects of “coder agreement” are possible (Dayton 2008; Porcu and Giambona 2017; Uebersax 1992). However, this would entail, at least in Stata, the use of its SEM procedures and a comprehensive knowledge of confirmatory factor analysis.
Three-parameter models are used to model guessing in analyses of knowledge and cognitive ability test items, but they do not apply here because trained coders should never be guessing.
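For readers unfamiliar with the guessing parameter, the standard textbook form of the three-parameter logistic model may help. This is a generic sketch, not the specification used in the article: the symbols a_i (discrimination), b_i (difficulty), c_i (lower asymptote, or "guessing"), and θ_j (latent trait) are conventional notation assumed here for illustration.

```latex
% Standard three-parameter logistic (3PL) item response function.
% c_i is the lower asymptote: the probability of a positive response
% that remains even at very low values of the latent trait \theta_j.
P(Y_{ij} = 1 \mid \theta_j)
  = c_i + (1 - c_i)\,
    \frac{\exp\{a_i(\theta_j - b_i)\}}{1 + \exp\{a_i(\theta_j - b_i)\}}

% Setting c_i = 0 (no guessing) gives the two-parameter form:
P(Y_{ij} = 1 \mid \theta_j)
  = \frac{\exp\{a_i(\theta_j - b_i)\}}{1 + \exp\{a_i(\theta_j - b_i)\}}
```

If trained coders never guess, the lower asymptote is zero by assumption and the extra parameter has nothing to estimate.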
The IRT approach is clearly more efficient when there are many coding decisions. We use only 4 examples here, but the TeenADE data set has over 100 coded ad features; with 5 coders, evaluating all coding decisions would require at least 1000 kappa-centric measures, i.e., our Table 2 with 100 columns of ten kappa-centric entries. No one can examine such a table sufficiently carefully.
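The count in this note can be reproduced with a few lines of arithmetic. The sketch below assumes that the ten entries per feature correspond to the ten possible two-coder pairings among five coders; that reading fits the stated totals but is an assumption, and the function name `pairwise_measure_count` is hypothetical.

```python
from math import comb


def pairwise_measure_count(n_coders: int, n_features: int) -> int:
    """Count two-coder agreement statistics needed to cover every feature.

    Assumes one agreement statistic per unordered coder pair per feature.
    """
    coder_pairs = comb(n_coders, 2)  # unordered pairs of coders
    return coder_pairs * n_features


# Four example features versus roughly 100 coded ad features, five coders.
print(pairwise_measure_count(5, 4))
print(pairwise_measure_count(5, 100))
```

Because the number of pairs grows quadratically with the number of coders, adding a sixth coder under the same assumption would raise the per-feature count from 10 to 15.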
References
Aho, K., Derryberry, D., Peterson, T.: Model selection for ecologists: the worldviews of AIC and BIC. Ecology 95(3), 631–636 (2014)
Artstein, R., Poesio, M.: Inter-coder agreement for computational linguistics. Comput. Linguistics 34(4), 555–596 (2008)
Banerjee, M., Capozzoli, M., McSweeney, L., Sinha, D.: Beyond kappa: A review of interrater agreement measures. Can. J. Stat. 27(1), 3–23 (1999)
Barker, A.B., Whittamore, K., Britton, J., Murray, R.L., Cranwell, J.: A content analysis of alcohol content in UK television. J. Public Health, fdy142–fdy142 (2018). doi:https://doi.org/10.1093/pubmed/fdy142
Barnhart, H.X., Haber, M.J., Lin, L.I.: An overview on assessing agreement with continuous measurements. J. Biopharm. Stat. 17(4), 529–569 (2007). doi:https://doi.org/10.1080/10543400701376480
Belur, J., Tompson, L., Thornton, A., Simon, M.: Interrater reliability in systematic review methodology: exploring variation in coder decision-making. Sociol. Methods Res., 1–29 (2018). doi:https://doi.org/10.1177/0049124118799372
Beullens, K., Schepers, A.: Display of Alcohol Use on Facebook: A Content Analysis. CyberPsychology Behav. Social Netw., 16(7), (2013). doi:https://doi.org/10.1089/cyber.2013.0044
Bleakley, A., Fishbein, M., Hennessy, M., Jordan, A., Chernin, A., Stevens, R.: Developing respondent based multi-media measures of exposure to sexual content. Commun. Methods Measures 2(1 & 2), 43–64 (2008)
Bleakley, A., Ellithorpe, M.E., Hennessy, M., Jamieson, P.E., Khurana, A., Weitz, I.: Risky movies, risky behaviors, and ethnic identity among Black adolescents. Soc. Sci. Med. 195, 131–137 (2017). doi:https://doi.org/10.1016/j.socscimed.2017.10.024
Brennan, R.L., Prediger, D.J.: Coefficient kappa: Some uses, misuses, and alternatives. Educ. Psychol. Meas. 41(3), 687–699 (1981)
Brown, T.: Confirmatory Factor Analysis for Applied Research, 2nd edn. Guilford, New York (2015)
Brownbill, A.L., Miller, C.L., Smithers, L.G., Braunack-Mayer, A.J.: Selling function: the advertising of sugar-containing beverages on Australian television. Health Promot. Int. (2020). doi:https://doi.org/10.1093/heapro/daaa052
Buchanan, L., Yeatman, H., Kelly, B., Kariippanon, K.: A thematic content analysis of how marketers promote energy drinks on digital platforms to young Australians. Aust. N. Z. J. Public Health. 42(6), 530–531 (2018). doi:https://doi.org/10.1111/1753-6405.12840
Burke, L.M., Hawley, J.A.: Swifter, higher, stronger: What’s on the menu? Science. 362(6416), 781–787 (2018). doi:https://doi.org/10.1126/science.aau2093
Byrt, T., Bishop, J., Carlin, J.B.: Bias, prevalence and kappa. J. Clin. Epidemiol. 46(5), 423–429 (1993)
Carletta, J.: Assessing agreement on classification tasks: the kappa statistic. Comput. Linguistics 22(2), 249–254 (1996)
Cavazos-Rehg, P.A., Krauss, M., Fisher, S.L., Salyer, P., Grucza, R.A., Bierut, L.J.: Twitter chatter about marijuana. J. Adolesc. Health 56(2), 139–145 (2015a)
Cavazos-Rehg, P.A., Krauss, M.J., Sowles, S.J., Bierut, L.J.: “Hey everyone, I’m drunk.” An evaluation of drinking-related Twitter chatter. J. Stud. Alcohol Drug 76(4), 635–643 (2015b)
Coates, A.E., Hardman, C.A., Halford, J.C.G., Christiansen, P., Boyland, E.J.: Food and Beverage Cues Featured in YouTube Videos of Social Media Influencers Popular With Children: An Exploratory Study. Front. Psychol., 10(2142), (2019). doi:https://doi.org/10.3389/fpsyg.2019.02142
Coleman, R., Hatley Major, L.: Ethical health communication: A content analysis of predominant frames and primes in public service announcements. J. Mass Media Ethics. 29(2), 91–107 (2014). doi:https://doi.org/10.1080/08900523.2014.893773
Dayton, C.M.: An introduction to latent class analysis. In: Menard, S. (ed.) Handbook of longitudinal research: Design, measurement, and analysis, pp. 357–371. Academic Press (2008)
DeJong, W., Wolf, R.C., Austin, S.B.: US federally funded television public service announcements (PSAs) to prevent HIV/AIDS: A content analysis. J. Health Communication. 6(3), 249–263 (2001). doi:https://doi.org/10.1080/108107301752384433
El-Khoury, J., Bilani, N., Abu-Mohammad, A., Ghazzaoui, R., Kassir, G., Rachid, E., El Hayek, S.: Drugs and Alcohol Themes in Recent Feature Films: A Content Analysis. J. Child Adolesc. Subst. Abuse 28(1), 8–14 (2019)
Emmers-Sommer, T.M., Allen, M.: Surveying the effect of media effects: A meta-analytic summary of the media effects research in Human Communication Research. Hum. Commun. Res. 25(4), 478–497 (1999)
Feinstein, A.R., Cicchetti, D.V.: High agreement but low kappa: I. The problems of two paradoxes. J. Clin. Epidemiol. 43(6), 543–549 (1990)
Fishbein, M., Ajzen, I.: Predicting and changing behavior: The reasoned action approach. Taylor & Francis (2010)
Fleiss, J.L., Cohen, J.: The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educ. Psychol. Meas. 33(3), 613–619 (1973)
Garrison, D.R., Cleveland-Innes, M., Koole, M., Kappelman, J.: Revisiting methodological issues in transcript analysis: Negotiated coding and reliability. The Internet and Higher Education 9(1), 1–8 (2006)
Glockner-Rist, A., Hoijtink, H.: The best of both worlds: Factor analysis of dichotomous data using item response theory and structural equation modeling. Struct. Equ. Model. 10(4), 544–565 (2003)
Gwet, K.L.: Inter-rater reliability: dependency on trait prevalence and marginal homogeneity. Stat. Methods Inter-Rater Reliab. Assess. Ser. 2(1), 1–9 (2002)
Gwet, K.L.: Computing inter-rater reliability and its variance in the presence of high agreement. Br. J. Math. Stat. Psychol. 61(1), 29–48 (2008)
Gwet, K.L.: Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters, vol. 2. Advanced Analytics, LLC (2014a)
Gwet, K.L.: Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters (vol. 1: Analysis of Categorical Ratings): Advanced Analytics, LLC (2014b)
Hallgren, K.A.: Computing inter-rater reliability for observational data: an overview and tutorial. Tutorials in quantitative methods for psychology. 8(1), 23–34 (2012). doi:https://doi.org/10.20982/tqmp.08.1.p023
Harris, J.L., Fleming-Milici, F., Kibwana-Jaff, A., Phaneuf, L.: Sugary drink advertising to youth: Continued barrier to public health progress. University of Connecticut Rudd Center for Food Policy and Obesity, Storrs (2020)
Hayes, A.F., Krippendorff, K.: Answering the call for a standard reliability measure for coding data. Communication Methods and Measures 1(1), 77–89 (2007)
Hennessy, M., Bleakley, A., Piotrowski, J.T., Mallya, G., Jordan, A.: Sugar-sweetened beverage consumption by adult caregivers and their children: the role of drink features and advertising exposure. Health Educ. Behav. 42(5), 677–686 (2015)
Hennessy, M., Bleakley, A., Ellithorpe, M.E., Maloney, E., Jordan, A.B., Stevens, R.: Reducing Unhealthy Normative Behavior: The Case of Sports and Energy Drinks. Health Educ. Behav., 1–12 (2021). doi:https://doi.org/10.1177/10901981211055468
Jordan, A., Kunkel, D., Manganello, J., Fishbein, M. (eds.): Media messages and public health: A decisions approach to content analysis. Routledge (2010)
Krauss, M., Grucza, R., Bierut, L., Cavazos-Rehg, P.: “Get drunk. Smoke weed. Have fun.”: A content analysis of tweets about marijuana and alcohol. Am. J. Health Promotion 31(3), 200–208 (2017)
Krippendorff, K.: Bivariate agreement coefficients for reliability of data. Sociol. Methodol. 2, 139–150 (1970)
Krippendorff, K.: Reliability in content analysis: Some common misconceptions and recommendations. Hum. Commun. Res. 30(3), 411–433 (2004)
Krippendorff, K.: Content analysis: An introduction to its methodology. Sage publications (2018)
Lacy, S., Watson, B.R., Riffe, D., Lovejoy, J.: Issues and best practices in content analysis. Journalism & Mass Communication Quarterly 92(4), 791–811 (2015)
Marriott, B.P., Hunt, K.J., Malek, A.M., Newman, J.C.: Trends in intake of energy and total sugar from sugar-sweetened beverages in the United States among children and adults, NHANES 2003–2016. Nutrients, 11(9), (2019)
Moran, A.J., Roberto, C.A.: Health warning labels correct parents’ misperceptions about sugary drink options. Am. J. Prev. Med. 55(2), e19–e27 (2018)
Munsell, C.R., Harris, J.L., Sarda, V., Schwartz, M.B.: Parents’ beliefs about the healthfulness of sugary drink options: opportunities to address misperceptions. Public Health. Nutr. 19(1), 46–54 (2016)
Mus, S., Rozas, L., Barnoya, J., Busse, P.: Gender representation in food and beverage print advertisements found in corner stores around schools in Peru and Guatemala. BMC Res. Notes. 14(1), 402 (2021). doi:https://doi.org/10.1186/s13104-021-05812-4
Neuendorf, K.A.: The content analysis guidebook. Sage, Thousand Oaks (2017)
O’Keefe, D.J.: Elaboration Likelihood Model. In: Donsbach, W. (ed.) The international encyclopedia of communication, vol. IV, pp. 1475–1480. Blackwell, Oxford (2008)
Oleinik, A., Popova, I., Kirdina, S., Shatalova, T.: On the choice of measures of reliability and validity in the content-analysis of texts. Qual. Quant. 48(5), 2703–2718 (2014)
Peteet, B., Roundtree, C., Dixon, S., Mosley, C., Miller-Roenigk, B., White, J., ... McCuistian, C.: ‘Codeine crazy:’ a content analysis of prescription drug references in popular music. J. Youth Stud., 1–17 (2020). doi:https://doi.org/10.1080/13676261.2020.1801992
Petty, R.E., Cacioppo, J.T.: Communication and persuasion: Central and peripheral routes to attitude change. Springer-Verlag, New York (1986)
Porcu, M., Giambona, F.: Introduction to latent class analysis with applications. J. Early Adolescence. 37(1), 129–158 (2017). doi:https://doi.org/10.1177/0272431616648452
Potter, W.J., Riddle, K.: A content analysis of the media effects literature. Journalism & Mass Communication Quarterly 84(1), 90–104 (2007)
Primack, B.A., Dalton, M.A., Carroll, M.V., Agarwal, A.A., Fine, M.J.: Content analysis of tobacco, alcohol, and other drugs in popular music. Arch. Pediatr. Adolesc. Med. 162(2), 169–175 (2008)
Raykov, T., Marcoulides, G.A.: A Course in Item Response Theory and Modeling with Stata. Stata Press, College Station (2018)
Reise, S., Ainsworth, A., Haviland, M.: Item response theory: Fundamentals, applications, and promise in psychological research. Curr. Dir. Psychol. Sci. 14(2), 95–101 (2005)
Riffe, D., Lacy, S., Watson, B., Fico, F.: Analyzing media messages: Using quantitative content analysis in research, Fourth edn. Routledge, New York (2019)
Russell, C.A., Russell, D.W., Grube, J.W.: Nature and impact of alcohol messages in a youth-oriented television series. J. Advertising. 38(3), 97–112 (2009). doi:https://doi.org/10.2753/JOA0091-3367380307
Shrout, P.E., Fleiss, J.L.: Intraclass correlations: uses in assessing rater reliability. Psychol. Bull. 86(2), 420–428 (1979)
Singh, J.: Tackling measurement problems with Item Response Theory: Principles, characteristics, and assessment, with an illustrative example. J. Bus. Res. 57(2), 184–208 (2004). doi:https://doi.org/10.1016/S0148-2963(01)00302-2
Skalski, P.D., Neuendorf, K.A., Cajigas, J.A.: Content analysis in the interactive media age. In: Neuendorf, K.A. (ed.) The content analysis guidebook, pp. 201–242. Sage, Thousand Oaks (2017)
StataCorp: Stata: Release 16 Statistical Software. StataCorp LP, College Station (2019)
Stern, S., Morr, L.: Portrayals of teen smoking, drinking, and drug use in recent popular movies. J. Health Communication. 18(2), 179–191 (2013). doi:https://doi.org/10.1080/10810730.2012.688251
Streiner, D.L.: Learning how to differ: Agreement and reliability statistics in psychiatry. Can. J. Psychiatry 40(2), 60–66 (1995)
Uebersax, J.S.: Modeling approaches for the analysis of observer agreement. Invest. Radiol. 27(9), 738–743 (1992)
Ullman, J.B., Bentler, P.M.: Structural equation modeling. In: Weiner, I.B. (ed.) Handbook of Psychology, Second edn., pp. 661–690. Wiley (2012)
Underwood, J.M., Brener, N., Thornton, J., Harris, W.A., Bryan, L.N., Shanklin, S.L., ... Chyen, D.: Overview and methods for the Youth Risk Behavior Surveillance System—United States, 2019. MMWR supplements, 69(1), 1 (2020)
Vassallo, A.J., Kelly, B., Zhang, L., Wang, Z., Young, S., Freeman, B.: Junk Food Marketing on Instagram: Content Analysis. JMIR Public. Health and Surveillance, 4(2) (2018). doi:https://doi.org/10.2196/publichealth.9594
Vercammen, K.A., Koma, J.W., Bleich, S.N.: Trends in energy drink consumption among US adolescents and adults, 2003–2016. Am. J. Prev. Med. 56(6), 827–833 (2019)
Zickar, M., Highhouse, S.: Looking closer at the effects of framing on risky choice: An item response theory analysis. Organ. Behav. Hum Decis. Process. 75(1), 75–91 (1998)
Zytnick, D., Park, S., Onufrak, S.J.: Child and caregiver attitudes about sports drinks and weekly sports drink intake among US youth. Am. J. Health Promotion 30(3), e110–e119 (2016)
Acknowledgements
This work was funded by the US National Institute of Dental and Craniofacial Research (NIH/NIDCR, grant number R21DE028414-01). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIDCR. We thank our coders (Sean Hinton, Leah Yaker, Hallie Rubinstein, Julia Sciacca, and Charles Zoeller) for their dedication to and effort on this project.
Ethics declarations
Disclosure Statement
No conflicts of interest declared.
About this article
Cite this article
Hennessy, M., Bleakley, A. & Ellithorpe, M.E. Evaluating and tracking qualitative content coder performance using item response theory. Qual Quant 57, 1231–1245 (2023). https://doi.org/10.1007/s11135-022-01397-7
DOI: https://doi.org/10.1007/s11135-022-01397-7