Collection Ranking and Selection for Federated Entity Search

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 7608)

Abstract

Entity search has emerged as an important research topic in recent years, but so far it has only been addressed in a centralized setting. In this paper we present an attempt to solve the task of ad-hoc entity retrieval in a cooperative distributed environment. We propose a new collection ranking and selection method for entity search, called AENN. The key underlying idea is that a lean, name-based representation of entities can be stored efficiently at the central broker, which therefore does not have to rely on sampling. This representation can then be used for collection ranking and selection, such that the number of collections selected and the number of results requested from each collection are adjusted dynamically on a per-query basis. Using a collection of structured datasets in RDF and a sample of real web search queries targeting entities, we demonstrate that our approach outperforms state-of-the-art distributed document retrieval methods in terms of both effectiveness and efficiency.
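
The abstract does not spell out how AENN scores collections, so the following is only a minimal sketch of the general idea it describes: a broker that keeps nothing but entity names per collection, ranks collections by name matches, and decides per query how many collections to contact and how many results to request from each. The class `NameBroker`, the function `select`, the parameters `total_results` and `min_share`, and the sample entities are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class NameBroker:
    """Toy broker that stores only entity names per collection (hypothetical)."""

    def __init__(self):
        # term -> collection -> set of entity ids whose name contains the term
        self.index = defaultdict(lambda: defaultdict(set))

    def add_entity(self, collection, entity_id, name):
        for term in name.lower().split():
            self.index[term][collection].add(entity_id)

    def rank_collections(self, query):
        """Rank collections by the number of entity names matching all query terms."""
        terms = query.lower().split()
        if not terms:
            return []
        scores = {}
        for collection, ids in self.index.get(terms[0], {}).items():
            matched = set(ids)
            for term in terms[1:]:
                matched &= self.index.get(term, {}).get(collection, set())
            if matched:
                scores[collection] = len(matched)
        # Highest score first; ties broken by collection name for determinism.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def select(ranked, total_results=8, min_share=0.2):
    """Assumed selection rule: keep collections holding at least `min_share`
    of all matches and split the result budget in proportion to their share."""
    total = sum(score for _, score in ranked)
    if total == 0:
        return {}
    plan = {}
    for collection, score in ranked:
        share = score / total
        if share >= min_share:
            plan[collection] = max(1, round(total_results * share))
    return plan


broker = NameBroker()
broker.add_entity("dbpedia", "e1", "Barack Obama")
broker.add_entity("dbpedia", "e2", "Michelle Obama")
broker.add_entity("dbpedia", "e3", "Obama Fukui")
broker.add_entity("geonames", "g1", "Obama")
broker.add_entity("musicbrainz", "m1", "Madonna")

broad_plan = select(broker.rank_collections("obama"))
narrow_plan = select(broker.rank_collections("barack obama"))
print(broad_plan)   # two collections are contacted
print(narrow_plan)  # only one collection is contacted
```

In this sketch the broad query is sent to two collections while the more specific query is sent to one, which illustrates the per-query adjustment the abstract mentions; the thresholding and proportional split are assumptions chosen for simplicity.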

Keywords

  • Mean Average Precision
  • Relevance Judgment
  • Mean Reciprocal Rank
  • Relevant Entity
  • Federate Entity

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Balog, K., Neumayer, R., Nørvåg, K. (2012). Collection Ranking and Selection for Federated Entity Search. In: Calderón-Benavides, L., González-Caro, C., Chávez, E., Ziviani, N. (eds) String Processing and Information Retrieval. SPIRE 2012. Lecture Notes in Computer Science, vol 7608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34109-0_9

  • DOI: https://doi.org/10.1007/978-3-642-34109-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34108-3

  • Online ISBN: 978-3-642-34109-0

  • eBook Packages: Computer Science, Computer Science (R0)