Abstract
In the named entity normalization task, a system identifies a canonical unambiguous referent for names like Bush or Alabama. Resolving the synonymy and ambiguity of such names can benefit end-to-end information access tasks. We evaluate two entity normalization methods based on Wikipedia in the context of both passage and document retrieval for question answering. We find that even a simple normalization method leads to improvements in early precision, for both document and passage retrieval. Moreover, better normalization results in better retrieval performance.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Khalid, M.A., Jijkoun, V., de Rijke, M. (2008). The Impact of Named Entity Normalization on Information Retrieval for Question Answering. In: Macdonald, C., Ounis, I., Plachouras, V., Ruthven, I., White, R.W. (eds) Advances in Information Retrieval. ECIR 2008. Lecture Notes in Computer Science, vol 4956. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78646-7_83
DOI: https://doi.org/10.1007/978-3-540-78646-7_83
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78645-0
Online ISBN: 978-3-540-78646-7
eBook Packages: Computer Science, Computer Science (R0)