Abstract
In this paper, we propose a method for name disambiguation. Given a set of names and documents, we cluster the documents and map each cluster to the appropriate name. The proposed method combines an unsupervised metric for computing semantic similarity with a computationally low-cost clustering algorithm. We experimented with the data used in the Web People Search Task of SemEval-2007, in which 16 teams participated. The proposed system performs on par with the officially best-performing system.
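To make the clustering step concrete, the following is a minimal sketch of single-pass clustering, not the paper's implementation. It assumes plain cosine similarity over bag-of-words counts as a stand-in for the paper's unsupervised semantic similarity metric, and the function names and the `threshold` value of 0.3 are hypothetical choices made for illustration only.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def single_pass_cluster(docs, threshold=0.3):
    """Visit each document once; join the most similar cluster or open a new one.

    `threshold` is an illustrative value, not one taken from the paper.
    Returns one integer cluster label per input document.
    """
    centroids = []  # summed term counts of the documents in each cluster
    labels = []
    for text in docs:
        vec = Counter(text.lower().split())
        best, best_sim = -1, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            centroids[best].update(vec)  # fold the document into the cluster
            labels.append(best)
        else:
            centroids.append(vec)  # no cluster is close enough: start a new one
            labels.append(len(centroids) - 1)
    return labels


docs = [
    "guitar rock band tour",
    "rock band guitar album",
    "physics professor university",
    "university physics lecture",
]
labels = single_pass_cluster(docs)
```

Because each document is compared only against the current cluster summaries and never revisited, the cost grows with the number of documents times the number of clusters, which is what makes this family of algorithms computationally cheap. The result depends on the order in which documents arrive and on the chosen threshold.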
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Iosif, E. (2010). Unsupervised Web Name Disambiguation Using Semantic Similarity and Single-Pass Clustering. In: Konstantopoulos, S., Perantonis, S., Karkaletsis, V., Spyropoulos, C.D., Vouros, G. (eds) Artificial Intelligence: Theories, Models and Applications. SETN 2010. Lecture Notes in Computer Science, vol 6040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12842-4_17
DOI: https://doi.org/10.1007/978-3-642-12842-4_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12841-7
Online ISBN: 978-3-642-12842-4
eBook Packages: Computer Science, Computer Science (R0)