Named Entity Recognition and Normalization: A Domain-Specific Language Approach

  • Miguel Vazquez
  • Monica Chagoyen
  • Alberto Pascual-Montano
Part of the Advances in Soft Computing book series (AINSC, volume 49)


We present RNer, a tool that performs Named Entity Recognition and Normalization of gene and protein mentions in biomedical text. The tool not only offers a complete solution to the problem, but does so by providing an easily configurable framework that separates the algorithmic details from the domain-specific ones. Configuration and tuning for particular tasks are done using domain-specific languages, which are clearer and more succinct than general-purpose languages, yet equally expressive. The system is evaluated using the BioCreative datasets.
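
To make the idea of configuring a recognizer through a domain-specific language more concrete, the following is a minimal sketch of an internal DSL for declaring token features as named regular expressions. It is not RNer's actual syntax or API: the class `FeatureRules`, its methods, the function `featurize`, and the rule names are all hypothetical and chosen only for illustration. The resulting feature dictionaries are the kind of input a sequence labeler such as a Conditional Random Field could consume.

```python
import re


class FeatureRules:
    """Hypothetical mini-DSL: token features declared as named regexes."""

    def __init__(self):
        self._rules = []

    def rule(self, name, pattern):
        # Register one feature; return self so declarations can be chained.
        self._rules.append((name, re.compile(pattern)))
        return self

    def features(self, token):
        # Evaluate every declared rule against a single token.
        return {name: bool(rx.search(token)) for name, rx in self._rules}


# Domain-specific configuration, kept separate from the algorithm above.
# The rules themselves are illustrative assumptions, not taken from the paper.
rules = (
    FeatureRules()
    .rule("has_digit", r"\d")
    .rule("all_caps", r"^[A-Z]+$")
    .rule("has_dash", r"-")
    .rule("greek", r"(?i)alpha|beta|gamma")
)


def featurize(sentence):
    # Whitespace tokenization is a simplification for this sketch.
    return [(tok, rules.features(tok)) for tok in sentence.split()]
```

Under these assumptions, adapting the recognizer to a new task would mean editing only the chained `rule` declarations, while the feature-extraction code stays unchanged.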


Regular Expression; Conditional Random Field; Name Entity Recognition; Entity Recognition; Biomedical Text
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Miguel Vazquez (1)
  • Monica Chagoyen (2, 3)
  • Alberto Pascual-Montano (2)

  1. Departamento de Ingeniería del Software e Inteligencia Artificial
  2. Dpto. Arquitectura de Computadores, Universidad Complutense de Madrid, Madrid, Spain
  3. Biocomputing Unit, Centro Nacional de Biotecnología - CSIC, Madrid, Spain
