Improving Europeana Search Experience Using Query Logs

  • Diego Ceccarelli
  • Sergiu Gordea
  • Claudio Lucchese
  • Franco Maria Nardini
  • Gabriele Tolomei
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6966)


Europeana is a long-term project funded by the European Commission with the goal of making Europe’s cultural and scientific heritage accessible to the public. Since 2008, about 1500 institutions have contributed to Europeana, enabling people to explore the digital resources of Europe’s museums, libraries and archives. The huge amount of collected multilingual, multimedia data is made available today through the Europeana portal, a search engine that lets users explore this content through textual queries. One of the most important techniques for enhancing users’ search experience in large information spaces is the exploitation of the knowledge contained in query logs. In this paper we present a characterization of the Europeana query log, showing statistics on common behavioral patterns of Europeana users. Our analysis highlights some significant differences between the Europeana query log and the historical data collected in general-purpose Web search engine logs. In particular, we find that both the query and the search session distributions behave differently. Finally, we use this information to design a query recommendation technique aimed at enhancing the functionality of the Europeana portal.


Session Length, Successful Session, Search Session, Successful Query, Virtual Document
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Diego Ceccarelli
    • 1
    • 2
  • Sergiu Gordea
    • 3
  • Claudio Lucchese
    • 1
  • Franco Maria Nardini
    • 1
  • Gabriele Tolomei
    • 1
    • 4
  1. ISTI–CNR, Pisa, Italy
  2. Dipartimento di Informatica, Università di Pisa, Italy
  3. AIT Austrian Institute of Technology GmbH, Wien, Austria
  4. Università Ca’ Foscari, Venezia, Italy
