Abstract
The Web of Data describes objects, entities, or “things” in terms of their attributes and their relationships, using RDF statements. There is a need to make this wealth of knowledge easily accessible by means of keyword search. Despite recent research efforts in this direction, there is a lack of understanding of how structured semantic data is best represented for text-based entity retrieval. The task we address in this paper is ad-hoc entity search: the retrieval of RDF resources that represent an entity described in the keyword query. We build upon and formalise existing entity modeling approaches within a generative language modeling framework, and compare them experimentally using a standard test collection provided by the Semantic Search Challenge evaluation series. We show that these models outperform the current state of the art in terms of retrieval effectiveness; however, this comes at the cost of abandoning a large part of the semantics behind the data. We propose a novel entity model capable of preserving the semantics associated with entities, without sacrificing retrieval effectiveness.
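To make the setting concrete, the sketch below ranks entities for a keyword query with a textbook query-likelihood language model and Dirichlet smoothing. It is an illustration under stated assumptions, not the models evaluated in the paper: the function names, the toy `ex:` entities, and the smoothing parameter `mu` are all hypothetical. Each entity is flattened into a single bag of words drawn from the object values of its RDF statements, which is exactly the kind of representation that discards predicate semantics.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would normalise far more carefully.
    return text.lower().split()


def rank_entities(query, entities, mu=100.0):
    """Rank entities by log P(query | entity) using Dirichlet smoothing.

    `entities` maps an entity URI to the list of object values taken from
    its RDF statements. Predicates are ignored (assumption of this sketch).
    """
    # One bag of words per entity, built from all of its object values.
    bags = {uri: Counter(tokenize(" ".join(values)))
            for uri, values in entities.items()}

    # Collection-wide term counts serve as the background model.
    collection = Counter()
    for bag in bags.values():
        collection.update(bag)
    collection_len = sum(collection.values())

    scores = {}
    for uri, bag in bags.items():
        entity_len = sum(bag.values())
        log_p = 0.0
        for term in tokenize(query):
            p_coll = collection[term] / collection_len
            if p_coll == 0.0:
                continue  # terms unseen in the whole collection are skipped
            log_p += math.log((bag[term] + mu * p_coll) / (entity_len + mu))
        scores[uri] = log_p
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: URI -> object values of the entity's triples.
entities = {
    "ex:Oslo": ["Oslo", "capital of Norway", "city"],
    "ex:Bergen": ["Bergen", "city in Norway", "harbour"],
    "ex:Ibsen": ["Henrik Ibsen", "Norwegian playwright"],
}
ranking = rank_entities("capital norway", entities)
top_uri = ranking[0][0]
```

In this toy run only `ex:Oslo` contains both query terms, so it ranks first. A representation that keeps predicates (for example, one bag of words per predicate or per predicate group) would trade some of this simplicity for preserved semantics.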
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Neumayer, R., Balog, K., Nørvåg, K. (2012). On the Modeling of Entities for Ad-Hoc Entity Search in the Web of Data. In: Baeza-Yates, R., et al. Advances in Information Retrieval. ECIR 2012. Lecture Notes in Computer Science, vol 7224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28997-2_12
DOI: https://doi.org/10.1007/978-3-642-28997-2_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28996-5
Online ISBN: 978-3-642-28997-2
eBook Packages: Computer Science, Computer Science (R0)