Formal Grammar for Hispanic Named Entities Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5449)

Abstract

Named Entity Recognition (NER) is a widely studied task in natural language processing. Many approaches have been developed to identify and classify named entity strings in both specific and open domains. Nevertheless, many NER systems must incorporate external modules to solve the interpretation problems that arise from proper nouns. In this article we focus on the study of ambiguity in Hispanic nominal sequences, whose constitution poses three main problems: (1) the association of given names and/or surnames; (2) the composition of such elements by means of a connector; and (3) the duality of given name/surname. To analyze the magnitude of the problem, two gazetteers were built, one with 93998 given names and the other with 13779 surnames. The gazetteer entries were used as terminal symbols of the proposed grammar to determine the valid interpretations of the nominal sequences; this is done by automatically labeling every element of each nominal sequence.
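The labeling idea in the abstract can be illustrated with a minimal sketch. It is not the grammar proposed in the paper: the three tiny gazetteers, the label letters G/S/C, and the single rule "one or more given names followed by one or more surnames, with optional connectors before any later element" are all assumptions made here for illustration. Each token receives every label its gazetteer membership allows, and only the label sequences accepted by the toy rule are kept as valid interpretations.

```python
import re
from itertools import product

# Toy gazetteers (hypothetical stand-ins for the paper's much larger lists).
GIVEN_NAMES = {"maria", "jose", "martin", "santiago"}
SURNAMES = {"martin", "santiago", "garcia", "lopez", "rosa"}
CONNECTORS = {"de", "del", "la", "y"}

# Assumed toy rule, written over label strings:
# G = given name, S = surname, C = connector.
# One or more given names, then one or more surnames; connectors may
# precede any element except the first.
VALID = re.compile(r"G(C*G)*(C*S)+")


def candidate_labels(token):
    """Return every label the gazetteers allow for one token."""
    t = token.lower()
    labels = []
    if t in GIVEN_NAMES:
        labels.append("G")
    if t in SURNAMES:
        labels.append("S")
    if t in CONNECTORS:
        labels.append("C")
    return labels


def interpretations(sequence):
    """List the label sequences that the toy rule accepts."""
    options = [candidate_labels(tok) for tok in sequence.split()]
    # product() yields nothing if any token has no candidate label.
    return [
        combo
        for combo in product(*options)
        if VALID.fullmatch("".join(combo))
    ]


if __name__ == "__main__":
    for seq in ("Jose Martin Garcia", "Santiago Martin", "Maria de la Rosa"):
        print(seq, interpretations(seq))
```

In this sketch "Jose Martin Garcia" keeps two interpretations because "Martin" appears in both toy gazetteers (the given name/surname duality), while "Maria de la Rosa" shows a surname composed with connectors.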

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Barceló, G., Cendejas, E., Sidorov, G., Bolshakov, I.A. (2009). Formal Grammar for Hispanic Named Entities Analysis. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2009. Lecture Notes in Computer Science, vol 5449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00382-0_15

  • DOI: https://doi.org/10.1007/978-3-642-00382-0_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00381-3

  • Online ISBN: 978-3-642-00382-0

  • eBook Packages: Computer Science, Computer Science (R0)
