Identification of Web Genres by User Warrant

  • Mark A. Rosso
  • Stephanie W. Haas
Chapter
Part of the Text, Speech and Language Technology book series (TLTB, volume 42)

Abstract

The use of genre metadata has been proposed as a potentially beneficial supplement to general web search engines. A key issue in this solution is the selection of genre labels and definitions for web pages. What genres should be used in a general search engine? How are these genres to be identified? What are effective methodologies for collecting user terminology for the purpose of deriving web page genre labels? Three criteria for effective labels are proposed. In light of these criteria, traditional genre theory is applied to the web. The existing research literature is examined, focusing on the results of a series of studies in which the feedback of almost 300 users was solicited for the purpose of building a classification of genre labels for web pages from the .edu Internet domain. The chapter includes discussion of the implications of our findings for future studies of web genre, including recommendations for best practice.

Keywords

Genre, Web search, Classification, User warrant, Metadata, Annotation

Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  1. School of Business, North Carolina Central University, Durham, USA
  2. School of Information & Library Science, University of North Carolina, Chapel Hill, USA
