Evaluating Web Archive Search Systems

  • Miguel Costa
  • Mário J. Silva
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7651)


The information published on the web, a representation of our collective memory, is rapidly vanishing. At least 77 web archives have been developed to cope with the web’s transience, but although their technology has reached a good level of maturity, the search services they provide still deliver unsatisfactory retrieval effectiveness. In this work, we propose an evaluation methodology for web archive search systems based on a list of requirements compiled from previous characterizations of web archives and their users. The methodology includes the design of a test collection and the selection of evaluation measures to support realistic and reproducible experiments. The test collection made it possible, for the first time, to measure the effectiveness of state-of-the-art IR technology employed in web archives. The results confirm the poor quality of the search results retrieved with such technology. However, we show how to combine temporal features with the regular topical features to improve search effectiveness on web archives. The test collection is available to the research community.
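As a rough illustration of the idea of combining temporal and topical features, the sketch below ranks a handful of documents with a weighted sum of two scores and compares precision at k against a topical-only ranking. The abstract does not say how the features are combined or which evaluation measures are selected, so the linear mix, the weight, the `Doc` fields, precision at k, and all the toy values here are assumptions made for illustration only, not the authors' method or data.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    topical: float   # hypothetical topical score in [0, 1], e.g. a normalized text-match score
    temporal: float  # hypothetical temporal feature in [0, 1], e.g. closeness to the queried period


def combined_score(doc: Doc, weight: float) -> float:
    """Linear mix of topical and temporal evidence (an assumed combination rule)."""
    return weight * doc.topical + (1.0 - weight) * doc.temporal


def rank(docs, weight):
    """Return the documents sorted by combined score, best first."""
    return sorted(docs, key=lambda d: combined_score(d, weight), reverse=True)


def precision_at_k(ranking, relevant_ids, k):
    """Fraction of the top-k ranked documents that are judged relevant."""
    top = ranking[:k]
    return sum(1 for d in top if d.doc_id in relevant_ids) / k


# Made-up toy data: four archived documents and a set of relevance judgments.
docs = [
    Doc("a", topical=0.9, temporal=0.1),
    Doc("b", topical=0.7, temporal=0.9),
    Doc("c", topical=0.6, temporal=0.8),
    Doc("d", topical=0.8, temporal=0.2),
]
relevant = {"b", "c"}

topical_only = rank(docs, weight=1.0)  # ignores the temporal feature
mixed = rank(docs, weight=0.5)         # equal weight on both features

print("P@2 topical only:", precision_at_k(topical_only, relevant, 2))
print("P@2 mixed:", precision_at_k(mixed, relevant, 2))
```

In a real experiment the weight would be tuned or learned against the relevance judgments of a test collection rather than fixed by hand, and the measure would be averaged over many topics.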





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Miguel Costa (1, 2)
  • Mário J. Silva (3)
  1. Foundation for National Scientific Computing, Lisbon, Portugal
  2. LaSIGE, Faculty of Science, University of Lisbon, Lisbon, Portugal
  3. IST/INESC-ID, Lisbon, Portugal
