Find, New, Copy, Web, Page - Tagging for the (Re-)Discovery of Web Pages

  • Martin Klein
  • Michael L. Nelson
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6966)

Abstract

The World Wide Web is highly dynamic, with resources constantly disappearing and (re-)surfacing. A ubiquitous result is the “404 Page not Found” error returned in response to requests for missing web pages. We investigate tags obtained from Delicious for the purpose of rediscovering such missing web pages with the help of search engines. We determine the best-performing tag-based query length, quantify the relevance of the results, and compare tags to retrieval methods based on a page’s content. We find that tags are only useful in addition to content-based methods. We further introduce the notion of “ghost tags”: terms used as tags that do not occur in the current version of a web page but did occur in a previous version. One third of these ghost tags are ranked high in Delicious and also occurred frequently in the document, which indicates their importance to both the user and the content of the document.
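
The ghost tag definition can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tokenize` and `ghost_tags`, the simple alphanumeric tokenization, and the sample data are all assumptions made for illustration, and the abstract does not say how terms were actually matched (for example, whether stemming or multi-word tags were handled).

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ghost_tags(tags, current_text, previous_text):
    """Return tags that are absent from the current version of a page
    but present in a previous version (hypothetical helper)."""
    current_terms = tokenize(current_text)
    previous_terms = tokenize(previous_text)
    return sorted(
        tag
        for tag in {t.lower() for t in tags}
        if tag not in current_terms and tag in previous_terms
    )


# Made-up sample data: "flash" appears only in the older version of the page.
tags = ["python", "tutorial", "flash"]
previous = "A Flash and Python tutorial"
current = "A Python tutorial"
print(ghost_tags(tags, current, previous))  # ['flash']
```

A tag that still occurs in the current version, or that never occurred in either version, is not a ghost tag under this reading of the definition.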

Keywords

Retrieval Performance, Mean Average Precision, Binary Relevance, Query Length, Social Bookmark
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Martin Klein (1)
  • Michael L. Nelson (1)
  1. Department of Computer Science, Old Dominion University, Norfolk
