Formal Grammar for Hispanic Named Entities Analysis

Barceló, Grettel; Cendejas, Eduardo; Sidorov, Grigori; Bolshakov, Igor A.

doi:10.1007/978-3-642-00382-0_15

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5449)

Abstract

Named Entity Recognition (NER) is a widely studied task in natural language processing. Many approaches have been developed to identify and classify named entity strings in both specific and open domains. Nevertheless, many NER systems must incorporate external modules to solve the interpretation problems that arise from proper nouns. This article studies ambiguity in Hispanic nominal sequences, whose constitution poses three main problems: (1) the association of given names and/or surnames; (2) the composition of such elements by means of a connector; and (3) the given name/surname duality. To gauge the magnitude of the problem, two gazetteers were built, one with 93998 given names and the other with 13779 surnames. The gazetteer entries serve as terminal symbols of the proposed grammar, which determines the valid interpretations of a nominal sequence by automatically labeling each of its elements.
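The labeling idea in the abstract can be illustrated with a minimal sketch. It is not the grammar proposed in the paper, which is not reproduced on this page. The tiny gazetteers, the connector list, the labels `G`/`S`/`C`, and the single rule "given names precede surnames, connectors only appear between name elements" are all assumptions made for illustration.

```python
import re
from itertools import product

# Hypothetical toy gazetteers; the paper's lists hold 93998 given names
# and 13779 surnames. "martín" and "santiago" appear in both sets to
# illustrate the given name/surname duality.
GIVEN_NAMES = {"juan", "maría", "martín", "santiago"}
SURNAMES = {"garcía", "lópez", "martín", "santiago"}
CONNECTORS = {"de", "del", "la", "y"}  # assumed connector list


def token_labels(token):
    """Return every label the gazetteers allow for one token."""
    t = token.lower()
    labels = []
    if t in GIVEN_NAMES:
        labels.append("G")  # given name
    if t in SURNAMES:
        labels.append("S")  # surname
    if t in CONNECTORS:
        labels.append("C")  # connector
    return labels


def is_valid(labels):
    """Assumed toy rule: no leading or trailing connector, and all given
    names come before all surnames once connectors are ignored."""
    if not labels or labels[0] == "C" or labels[-1] == "C":
        return False
    names = "".join(label for label in labels if label != "C")
    return re.fullmatch(r"G*S*", names) is not None


def interpretations(sequence):
    """Enumerate every labeling of the sequence that the toy rule accepts."""
    tokens = sequence.split()
    options = [token_labels(t) for t in tokens]
    if any(not opts for opts in options):
        return []  # a token missing from all lists cannot be labeled
    return [
        list(zip(tokens, combo))
        for combo in product(*options)
        if is_valid(combo)
    ]


if __name__ == "__main__":
    for reading in interpretations("Juan Martín Santiago"):
        print(reading)
```

Under these assumptions, "Juan Martín Santiago" yields three readings (G G G, G G S, and G S S), because two of its tokens appear in both gazetteers. A real grammar would need further rules and much larger lists to rank or prune such readings.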

Author information

Authors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico City, Mexico
Grettel Barceló, Eduardo Cendejas, Grigori Sidorov & Igor A. Bolshakov

Editor information

Editors and Affiliations

National Polytechnic Institute, Center for Computing Research, 07738, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Barceló, G., Cendejas, E., Sidorov, G., Bolshakov, I.A. (2009). Formal Grammar for Hispanic Named Entities Analysis. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2009. Lecture Notes in Computer Science, vol 5449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00382-0_15

DOI: https://doi.org/10.1007/978-3-642-00382-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00381-3
Online ISBN: 978-3-642-00382-0
eBook Packages: Computer Science, Computer Science (R0)
