Abstract
The use of genre metadata has been proposed as a potentially beneficial supplement to general web search engines. A key issue in this solution is the selection of genre labels and definitions for web pages. What genres should be used in a general search engine? How are these genres to be identified? What are effective methodologies for collecting user terminology for the purpose of deriving web page genre labels? Three criteria for effective labels are proposed. In light of these criteria, traditional genre theory is applied to the web. The existing research literature is examined, focusing on the results of a series of studies in which the feedback of almost 300 users was solicited for the purpose of building a classification of genre labels for web pages from the .edu Internet domain. The chapter includes discussion of the implications of our findings for future studies of web genre, including recommendations for best practice.
Copyright information
© 2010 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Rosso, M.A., Haas, S.W. (2010). Identification of Web Genres by User Warrant. In: Mehler, A., Sharoff, S., Santini, M. (eds) Genres on the Web. Text, Speech and Language Technology, vol 42. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9178-9_3
DOI: https://doi.org/10.1007/978-90-481-9178-9_3
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9177-2
Online ISBN: 978-90-481-9178-9
eBook Packages: Computer Science (R0)