Web People Search with Domain Ranking

  • Zornitsa Kozareva
  • Rumen Moraliyski
  • Gaël Dias
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5246)


The World Wide Web is the largest information source that people consult daily for facts and events. Studies show that 30% of searches relate to proper names such as organizations, actors, singers, books, or movie titles. However, name ambiguity poses a serious problem: the same name can be shared by different individuals, or even by different proper-name categories. To provide faster and more relevant access to the requested information, current research focuses on clustering web pages that refer to the same individual. In this paper, we address the web people search problem by integrating domain information.


Keywords: Noun Phrase; Computational Linguistics; Domain Information; Name Entity; Semantic Evaluation





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Zornitsa Kozareva (1)
  • Rumen Moraliyski (2)
  • Gaël Dias (2)
  1. University of Alicante, Spain
  2. University of Beira Interior, Covilhã, Portugal
