Abstract
Are established methods of content analysis (CA) adequate to analyze web content, or should new methods be devised to address new technological developments? This article addresses this question by contrasting narrow and broad interpretations of the concept of web content analysis. The utility of a broad interpretation that subsumes the narrow one is then illustrated with reference to research on weblogs (blogs), a popular web format in which features of HTML documents and interactive computer-mediated communication converge. The article concludes by proposing an expanded Web Content Analysis (WebCA) paradigm in which insights from paradigms such as discourse analysis and social network analysis are operationalized and implemented within a general content analytic framework.
Keywords
- Content Analysis
- Social Network Analysis
- Discourse Analysis
- Methodological Paradigm
- Content Analysis Study
Notes
1. While McMillan (2000) acknowledges that “the size of the sample depends on factors such as the goals of the study” (p. 2, emphasis added), she does not mention that different research goals/questions might call for different types of samples. Rather, she asserts that random samples are required for “rigor” in all CA studies, a claim that many researchers would dispute (see, e.g., note 5).
2.
3. In a review of 25 years of content analyses, Riffe and Freitag (1997; cited in Weare & Lin, 2000) found that most studies were based on convenience or purposive samples; only 22.2% of the studies attempted to be representative of the population of interest.
4. On grounded theory, see Glaser and Strauss (1967).
5. Herring (2004, p. 350) notes that “in CMDA, [sampling] is rarely done randomly, since random sampling sacrifices context, and context is important in interpreting discourse analysis results.”
6. This estimate is based on a report that the number of blogs created at major hosts was 134–144 million in October 2005 (http://www.blogherald.com/2005/10/10/the-blog-herald-blog-count-october-2005/, accessed December 7, 2007). Blog creation, especially in countries outside the U.S., has increased since then, although many blogs have also been abandoned (Wikipedia, June 28, 2008).
7. The (We)blog Research on Genre (BROG) project. See http://en.wikipedia.org/wiki/BROG, accessed August 26, 2009.
8. For example, Herring, Scheidt, et al. (2004, 2005) found that, contrary to popular claims that blog entries typically contain links and link often to other blogs, the average number of links in entries in randomly selected blogs was 0.65, and most entries contained 0 links. Moreover, the majority of links were to websites created by others, with links to other blogs coming in a distant third.
9. See, e.g., Herring, Scheidt, et al. (2004, 2005); Mishne and Glance (2006).
10. This study is an exception to the generalization that most computational web studies do not orient toward content analysis. The stated goal of Nakajima et al. (2005, p. 1) is to capture and analyze “conversational web content” in blogs.
References
Ali-Hasan, N., & Adamic, L. (2007). Expressing social relationships on the blog through links and comments. Paper presented at the international conference for weblogs and social media, Boulder, CO.
Balog, K., Mishne, G., & de Rijke, M. (2006). Why are they excited? Identifying and explaining spikes in blog mood levels. Paper presented at the 11th meeting of the European Chapter of the Association for Computational Linguistics, Trento, Italy.
Baran, S. J. (2002). Introduction to mass communication (2nd ed.). New York: McGraw-Hill.
Bates, M. J., & Lu, S. (1997). An exploratory profile of personal home pages: Content, design, metaphors. Online and CDROM Review, 21(6), 331–340.
Bauer, M. (2000). Classical content analysis: A review. In M. W. Bauer & G. Gaskell (Eds.), Qualitative researching with text, image, and sound: A practical handbook (pp. 131–151). London: Sage.
Berelson, B. (1952). Content analysis in communication research. New York: Free Press.
Berelson, B., & Lazarsfeld, P. F. (1948). The analysis of communication content. Chicago/New York: University of Chicago and Columbia University.
Blood, R. (2002). Introduction. In J. Rodzvilla (Ed.), We’ve got blog: How weblogs are changing our culture (pp. ix–xiii). Cambridge, MA: Perseus.
Bush, C. R. (1951). The analysis of political campaign news. Journalism Quarterly, 28(2), 250–252.
Dimitrova, D. V., & Neznanski, M. (2006). Online journalism and the war in cyberspace: A comparison between U.S. and international newspapers. Journal of Computer-Mediated Communication, 12(1), Article 13. Retrieved from http://jcmc.indiana.edu/vol12/issue1/dimitrova.html
Efimova, L., & de Moor, A. (2005). Beyond personal web publishing: An exploratory study of conversational blogging practices. Proceedings of the Thirty-Eighth Hawaii International Conference on System Sciences. Los Alamitos, CA: IEEE.
Fogg, B. J., Kameda, T., Boyd, J., Marshall, J., Sethi, R., Sockol, M., et al. (2002). Stanford-Makovsky web credibility study 2002: Investigating what makes web sites credible today. Retrieved from http://captology.stanford.edu/pdf/Stanford-MakovskyWebCredStudy2002-prelim.pdf
Foot, K. A., Schneider, S. M., Dougherty, M., Xenos, M., & Larsen, E. (2003). Analyzing linking practices: Candidate sites in the 2002 U.S. electoral Web sphere. Journal of Computer-Mediated Communication, 8(4). Retrieved from http://jcmc.indiana.edu/vol8/issue4/foot.html
Gibson, D., Kleinberg, J., & Raghavan, P. (1998). Inferring web communities from link topology. Proceedings of the 9th ACM Conference on Hypertext and Hypermedia. Pittsburgh, PA: ACM.
Glaser, B., & Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative research. Chicago: Aldine.
Herring, S. C. (2004). Computer-mediated discourse analysis: An approach to researching online behavior. In S. A. Barab, R. Kling, & J. H. Gray (Eds.), Designing for virtual communities in the service of learning (pp. 338–376). New York: Cambridge University Press.
Herring, S. C., & Paolillo, J. C. (2006). Gender and genre variation in weblogs. Journal of Sociolinguistics, 10(4), 439–459.
Herring, S. C., Kouper, I., Paolillo, J., Scheidt, L. A., Tyworth, M., Welsch, P., et al. (2005). Conversations in the blogosphere: An analysis “from the bottom up.” Proceedings of the Thirty-Eighth Hawai’i International Conference on System Sciences. Los Alamitos, CA: IEEE.
Herring, S. C., Scheidt, L. A., Bonus, S., & Wright, E. (2004). Bridging the gap: A genre analysis of weblogs. Proceedings of the Thirty-Seventh Hawai’i International Conference on System Sciences. Los Alamitos, CA: IEEE.
Herring, S. C., Scheidt, L. A., Bonus, S., & Wright, E. (2005). Weblogs as a bridging genre. Information, Technology & People, 18(2), 142–171.
Herring, S. C., Scheidt, L. A., Kouper, I., & Wright, E. (2006). Longitudinal content analysis of weblogs: 2003–2004. In M. Tremayne (Ed.), Blogging, citizenship, and the future of media (pp. 3–20). London: Routledge.
Holsti, O. R. (1969). Content analysis for the social sciences and humanities. Reading, MA: Addison Wesley.
Huffaker, D. A., & Calvert, S. L. (2005). Gender, identity and language use in teenage blogs. Journal of Computer-Mediated Communication, 10(2). Retrieved from http://jcmc.indiana.edu/vol10/issue2/huffaker.html
Jackson, M. (1997). Assessing the structure of communication on the world wide web. Journal of Computer-Mediated Communication, 3(1). Retrieved from http://www.ascusc.org/jcmc/vol3/issue1/jackson.html
Krippendorff, K. (1980). Content analysis: An introduction to its methodology. Newbury Park: Sage.
Krippendorff, K. (2008). Testing the reliability of content analysis data: What is involved and why. In K. Krippendorff & M. A. Bock (Eds.), The content analysis reader (pp. 350–357). Thousand Oaks, CA: Sage. Retrieved from http://www.asc.upenn.edu/usr/krippendorff/dogs.html
Kutz, D. O., & Herring, S. C. (2005). Micro-longitudinal analysis of web news updates. Proceedings of the Thirty-Eighth Hawai’i International Conference on System Sciences. Los Alamitos, CA: IEEE.
McMillan, S. J. (2000). The microscope and the moving target: The challenge of applying content analysis to the world wide web. Journalism and Mass Communication Quarterly, 77(1), 80–98.
Mishne, G., & Glance, N. (2006). Leave a reply: An analysis of weblog comments. Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem, 15th World Wide Web Conference, Edinburgh.
Mitra, A. (1999). Characteristics of the WWW text: Tracing discursive strategies. Journal of Computer-Mediated Communication, 5(1). Retrieved from http://www.ascusc.org/jcmc/vol5/issue1/mitra.html
Mitra, A., & Cohen, E. (1999). Analyzing the web: Directions and challenges. In S. Jones (Ed.), Doing internet research: Critical issues and methods for examining the net (pp. 179–202). Thousand Oaks, CA: Sage.
Nakajima, S., Tatemura, J., Hino, Y., Hara, Y., & Tanaka, K. (2005). Discovering important bloggers based on analyzing blog threads. Paper presented at WWW2005, Chiba, Japan.
Park, H. W. (2003). What is hyperlink network analysis? New method for the study of social structure on the web. Connections, 25(1), 49–61.
Pfeil, U., Zaphiris, P., & Ang, C. S. (2006). Cultural differences in collaborative authoring of Wikipedia. Journal of Computer-Mediated Communication, 12(1), Article 5. Retrieved from http://jcmc.indiana.edu/vol12/issue1/pfeil.html
Scheidt, L. A., & Wright, E. (2004). Common visual design elements of weblogs. In L. Gurak, S. Antonijevic, L. Johnson, C. Ratliff, & J. Reyman (Eds.), Into the blogosphere: Rhetoric, community, and culture of weblogs. Retrieved from http://blog.lib.umn.edu/blogosphere/
Schneider, S. M., & Foot, K. A. (2004). The web as an object of study. New Media & Society, 6(1), 114–122.
Scott, W. (1955). Reliability of content analysis: The case of nominal scale coding. Public Opinion Quarterly, 19, 321–325.
Singh, N., & Baack, D. W. (2004). Web site adaptation: A cross-cultural comparison of U.S. and Mexican web sites. Journal of Computer-Mediated Communication, 9(4). Retrieved from http://jcmc.indiana.edu/vol9/issue4/singh_baack.html
Thelwall, M. (2002). The top 100 linked pages on UK university web sites: High inlink counts are not usually directly associated with quality scholarly content. Journal of Information Science, 28(6), 485–493.
Trammell, K. D. (2006). Blog offensive: An exploratory analysis of attacks published on campaign blog posts from a political public relations perspective. Public Relations Review, 32(4), 402–406.
Trammell, K. D., Tarkowski, A., Hofmokl, J., & Sapp, A. M. (2006). Rzeczpospolita blogów [Republic of Blog]: Examining Polish bloggers through content analysis. Journal of Computer-Mediated Communication, 11(3), Article 2. Retrieved from http://jcmc.indiana.edu/vol11/issue3/trammell.html
Tremayne, M., Zheng, N., Lee, J. K., & Jeong, J. (2006). Issue publics on the web: Applying network theory to the war blogosphere. Journal of Computer-Mediated Communication, 12(1), Article 15. Retrieved from http://jcmc.indiana.edu/vol12/issue1/tremayne.html
Wakeford, N. (2000). New media, new methodologies: Studying the web. In D. Gauntlett (Ed.), Web.studies: Rewiring media studies for the digital age (pp. 31–42). London: Arnold.
Waseleski, C. (2006). Gender and the use of exclamation points in computer-mediated communication: An analysis of exclamations posted to two electronic discussion lists. Journal of Computer-Mediated Communication, 11(4), Article 6. Retrieved from http://jcmc.indiana.edu/vol11/issue4/waseleski.html
Weare, C., & Lin, W. Y. (2000). Content analysis of the world wide web – Opportunities and challenges. Social Science Computer Review, 18(3), 272–292.
Wikipedia. (2008). Blog. Retrieved on June 28, 2008, from http://en.wikipedia.org/wiki/Blog
Williams, P., Trammell, K., Postelnicu, M., Landreville, K., & Martin, J. (2005). Blogging and hyperlinking: Use of the web to enhance visibility during the 2004 U.S. campaign. Journalism Studies, 6(2), 177–186.
Young, J., & Foot, K. (2005). Corporate e-cruiting: The construction of work in Fortune 500 recruiting web sites. Journal of Computer-Mediated Communication, 11(1), Article 3. Retrieved from http://jcmc.indiana.edu/vol11/issue1/young.html
Copyright information
© 2009 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Herring, S.C. (2009). Web Content Analysis: Expanding the Paradigm. In: Hunsinger, J., Klastrup, L., Allen, M. (eds) International Handbook of Internet Research. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-9789-8_14
DOI: https://doi.org/10.1007/978-1-4020-9789-8_14
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-9788-1
Online ISBN: 978-1-4020-9789-8
eBook Packages: Computer Science (R0)