A Classification Framework for Disambiguating Web People Search Result Using Feedback

  • Ou Jin
  • Shenghua Bao
  • Zhong Su
  • Yong Yu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6897)


This paper is concerned with the problem of disambiguating Web people search results. Finding information about people is one of the most common activities on the Web. However, the results of searching for person names suffer heavily from ambiguity. In this paper, we propose a classification framework that addresses this problem using an additional feedback page. Compared with the traditional solution, which clusters the search results, our framework has lower computational complexity and is more effective. We also developed two new features under the framework that utilize information beyond tokens. Experiments show that performance improves greatly when the two features are used. Different classification methods are also compared for their effectiveness on the task.
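As a rough illustration of the feedback-based setting described above, the sketch below labels each result page by its cosine similarity to a single feedback page. It is not the framework proposed in the paper: the whitespace tokenizer, the raw term-frequency vectors, the function names, and the `threshold` value are all assumptions made for this example, and the two features beyond tokens are not modeled. It does show why such a scheme can be cheap: each result page is compared once with the feedback page, whereas clustering typically compares result pages with one another.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, not the paper's preprocessing)."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_results(feedback_page, result_pages, threshold=0.3):
    """Return True for each page judged to refer to the same person as the feedback page.

    The threshold is a hypothetical value chosen only for illustration.
    """
    feedback_vec = Counter(tokenize(feedback_page))
    return [
        cosine(feedback_vec, Counter(tokenize(page))) >= threshold
        for page in result_pages
    ]


feedback = "john smith professor of computer science at the university"
pages = [
    "john smith computer science professor publications",
    "john smith plays guitar in a rock band",
]
labels = classify_results(feedback, pages)
```

In this toy input the first page shares most of its tokens with the feedback page and is accepted, while the second shares only the name and is rejected. A learned classifier such as a support vector machine could replace the fixed threshold.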


Support Vector Machine, Search Result, Relevance Feedback, Cosine Similarity, Result Page
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ou Jin (1)
  • Shenghua Bao (2)
  • Zhong Su (2)
  • Yong Yu (1)
  1. Shanghai Jiao Tong University, Shanghai, China
  2. IBM China Research Lab, Beijing, China
