Towards Faceted Search for Named Entity Queries

  • Sofia Stamou
  • Lefteris Kozanidis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5731)


A considerable fraction of web queries contain named entities. Because a proper name may refer to multiple entities, search engines increasingly need to handle named entity queries efficiently. In this paper, we present a technique that automatically identifies the distinct subject classes to which a named entity query might refer and selects a set of appropriate facets for denoting the query properties within each class. We also suggest a method that examines the distribution of the identified query facets within the contents of the query-matching pages and groups search results according to their entity denotation types. Our preliminary study shows that our technique identifies useful facets for representing the properties of named entity queries in each of their referenced subject classes.
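The grouping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the facet terms, the example pages, and the function names (`tokenize`, `facet_distribution`, `group_results`) are all hypothetical, and plain term counting stands in for whatever distribution measure the paper actually uses.

```python
from collections import Counter, defaultdict
import re

# Hypothetical facet terms per subject class for the ambiguous query "jaguar".
# Illustrative values only; they are not taken from the paper.
FACETS = {
    "animal": {"habitat", "species", "prey", "rainforest"},
    "car": {"engine", "model", "dealer", "sedan"},
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def facet_distribution(text, facets):
    """For each class, count how many tokens of the page are facet terms of that class."""
    counts = Counter(tokenize(text))
    return {cls: sum(counts[term] for term in terms) for cls, terms in facets.items()}

def group_results(pages, facets):
    """Assign each page to the class with the highest facet count.

    Pages that contain no facet term at all are placed in 'unclassified'.
    """
    groups = defaultdict(list)
    for page_id, text in pages.items():
        dist = facet_distribution(text, facets)
        best = max(dist, key=dist.get)
        groups[best if dist[best] > 0 else "unclassified"].append(page_id)
    return dict(groups)

# Hypothetical query-matching pages.
pages = {
    "p1": "The jaguar is a species whose habitat is the rainforest.",
    "p2": "Visit your Jaguar dealer to test the new sedan model and its engine.",
    "p3": "Jaguar news and updates.",
}
groups = group_results(pages, FACETS)
print(groups)
```

In this toy setting each page lands in the class whose facet terms it mentions most often; a real system would presumably derive the classes and facets automatically, as the abstract describes, rather than from a hand-written dictionary.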


Keywords: faceted search, named entity queries, Wikipedia corpus





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Sofia Stamou (1)
  • Lefteris Kozanidis (1)
  1. Computer Engineering and Informatics Department, Patras University, Greece
