
On the Modeling of Entities for Ad-Hoc Entity Search in the Web of Data

  • Conference paper
Advances in Information Retrieval (ECIR 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7224)

Abstract

The Web of Data describes objects, entities, or “things” in terms of their attributes and their relationships, using RDF statements. There is a need to make this wealth of knowledge easily accessible by means of keyword search. Despite recent research efforts in this direction, there is a lack of understanding of how structured semantic data is best represented for text-based entity retrieval. The task we address in this paper is ad-hoc entity search: the retrieval of RDF resources that represent an entity described in the keyword query. We build upon and formalise existing entity modeling approaches within a generative language modeling framework, and compare them experimentally using a standard test collection provided by the Semantic Search Challenge evaluation series. We show that these models outperform the current state of the art in terms of retrieval effectiveness; however, this comes at the cost of abandoning a large part of the semantics behind the data. We propose a novel entity model capable of preserving the semantics associated with entities without sacrificing retrieval effectiveness.
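To make the task concrete, the sketch below illustrates one simple way an entity described by RDF triples could be ranked for a keyword query within a language modeling framework. It is not the model proposed in the paper: it assumes an unstructured representation in which all object values of an entity are collapsed into one bag of words, scored by query likelihood with Dirichlet smoothing. The triples, the `ex:` identifiers, the function names, and the smoothing parameter `mu` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Toy (subject, predicate, object) triples; every value here is made up.
TRIPLES = [
    ("ex:Oslo", "rdfs:label", "Oslo"),
    ("ex:Oslo", "ex:abstract", "Oslo is the capital city of Norway"),
    ("ex:Bergen", "rdfs:label", "Bergen"),
    ("ex:Bergen", "ex:abstract", "Bergen is a city on the west coast of Norway"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_entity_models(triples):
    """Collapse all object values of each subject into one bag of words.

    The predicate is ignored, which is exactly the loss of semantics
    that an unstructured entity representation implies.
    """
    models = {}
    for subject, _predicate, obj in triples:
        models.setdefault(subject, Counter()).update(tokenize(obj))
    return models


def score(query, entity_terms, collection_terms, mu=10.0):
    """Log query likelihood of an entity with Dirichlet smoothing."""
    entity_len = sum(entity_terms.values())
    collection_len = sum(collection_terms.values())
    log_p = 0.0
    for term in tokenize(query):
        p_coll = collection_terms[term] / collection_len
        if p_coll == 0.0:
            continue  # assumption: ignore query terms unseen in the collection
        p_term = (entity_terms[term] + mu * p_coll) / (entity_len + mu)
        log_p += math.log(p_term)
    return log_p


def rank(query, triples, mu=10.0):
    """Return entity identifiers ordered from best to worst match."""
    models = build_entity_models(triples)
    collection = Counter()
    for terms in models.values():
        collection.update(terms)
    scored = [(score(query, terms, collection, mu), entity)
              for entity, terms in models.items()]
    return [entity for _score, entity in sorted(scored, reverse=True)]


if __name__ == "__main__":
    print(rank("capital of Norway", TRIPLES))
```

Because the predicate is discarded in `build_entity_models`, a match on a label counts the same as a match on any other attribute. This is the kind of trade-off between retrieval effectiveness and preserved semantics that the abstract refers to.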




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Neumayer, R., Balog, K., Nørvåg, K. (2012). On the Modeling of Entities for Ad-Hoc Entity Search in the Web of Data. In: Baeza-Yates, R., et al. Advances in Information Retrieval. ECIR 2012. Lecture Notes in Computer Science, vol 7224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28997-2_12

  • DOI: https://doi.org/10.1007/978-3-642-28997-2_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28996-5

  • Online ISBN: 978-3-642-28997-2

  • eBook Packages: Computer Science, Computer Science (R0)
