Neuroinformatics, Volume 1, Issue 2, pp 177–192

Neuroanatomical term generation and comparison between two terminologies

  • Prashanti R. Srinivas
  • Daniel Gusfield
  • Oliver Mason
  • Michael Gertz
  • Michael Hogarth
  • James Stone
  • Edward G. Jones
  • Fredric A. Gorin
Original Article

Abstract

An approach and software tools are described for identifying and extracting compound terms (CTs), acronyms, and their associated contexts from textual material associated with neuroanatomical atlases. A set of simple syntactic rules was appended to the output of a commercially available part-of-speech (POS) tagger (Qtag v 3.01) to extract CTs and their associated context from the texts of neuroanatomical atlases. This “hybrid” parser appears to be highly sensitive and recognized 96% of the potentially germane neuroanatomical CTs and acronyms present in the cat and primate thalamic atlases.
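The idea of layering syntactic rules on POS-tagger output can be sketched as follows. This is a minimal illustration, not the authors' rule set: the abstract does not list the rules, so the tag names (Penn Treebank style), the "adjective/noun run ending in a noun" rule, the uppercase-acronym pattern, and the function names `extract_compound_terms` and `extract_acronyms` are all assumptions made for the example. The input is assumed to be a list of (word, tag) pairs already produced by a tagger.

```python
import re

# Assumed Penn Treebank-style tags; the tagger's real tag set may differ.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}
ADJ_TAGS = {"JJ"}
# Assumed acronym shape: two to six uppercase letters.
ACRONYM = re.compile(r"^[A-Z]{2,6}$")


def extract_compound_terms(tagged):
    """Return maximal adjective/noun runs of two or more tokens ending in a noun."""
    terms, run = [], []
    # A sentinel token with a non-matching tag flushes the final run.
    for word, tag in list(tagged) + [("", "END")]:
        if tag in NOUN_TAGS or tag in ADJ_TAGS:
            run.append((word, tag))
            continue
        # Trim trailing adjectives so the candidate ends in a noun head.
        while run and run[-1][1] not in NOUN_TAGS:
            run.pop()
        if len(run) >= 2:
            terms.append(" ".join(w for w, _ in run))
        run = []
    return terms


def extract_acronyms(tagged):
    """Return tokens whose surface form matches the assumed acronym pattern."""
    return [w for w, _ in tagged if ACRONYM.match(w)]


if __name__ == "__main__":
    # Hand-tagged example sentence (hypothetical, not taken from the atlases).
    sentence = [
        ("The", "DT"), ("ventral", "JJ"), ("posterior", "JJ"), ("nucleus", "NN"),
        ("(", "("), ("VP", "NNP"), (")", ")"), ("receives", "VBZ"),
        ("lemniscal", "JJ"), ("fibers", "NNS"), (".", "."),
    ]
    print(extract_compound_terms(sentence))
    print(extract_acronyms(sentence))
```

A rule layer of this kind is cheap to adjust: changing which tags may appear inside a run, or the minimum run length, trades sensitivity against the number of spurious candidates that need manual review.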

A comparison of neuroanatomical CTs and acronyms between the cat and primate atlas texts was initially performed using exact-term matching. Implementing string-matching algorithms significantly improved the identification of relevant terms and acronyms shared by the two domains. The End Gap Free string matcher identified 98% of CTs, and the Needleman-Wunsch (NW) string matcher matched 36% of acronyms between the two atlases.
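The difference between the two matchers can be illustrated with a small dynamic-programming sketch. Needleman-Wunsch scores a global alignment in which every gap is penalized; the end-gap-free variant leaves gaps at the beginning and end of either string unpenalized, which helps when one term is contained in a longer one. The scoring values below (match +1, mismatch -1, gap -1) and the function name `align_score` are assumptions for illustration; the abstract does not state the parameters or thresholds actually used.

```python
def align_score(a, b, end_gap_free=False, match=1, mismatch=-1, gap=-1):
    """Score the best alignment of strings a and b.

    end_gap_free=False: Needleman-Wunsch global alignment.
    end_gap_free=True: leading and trailing gaps in either string cost nothing.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not end_gap_free:
        # Global alignment: leading gaps are charged in the first row/column.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    if end_gap_free:
        # Trailing gaps are free: take the best cell in the last row or column.
        return max(max(score[-1]), max(row[-1] for row in score))
    return score[-1][-1]


if __name__ == "__main__":
    # OCR-style error: capital "I" in place of lowercase "l".
    print(align_score("nucIeus", "nucleus"))
    # A term embedded in a longer term, scored both ways.
    print(align_score("lateral nucleus", "nucleus"))
    print(align_score("lateral nucleus", "nucleus", end_gap_free=True))
```

With these assumed scores, a single OCR substitution lowers the global score only slightly, while the end-gap-free score for an embedded term equals the score of a perfect match on the shorter string. In practice a score would be normalized by term length and compared against a threshold before two terms are declared a match.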

Combining several simple grammatical and lexical rules with the POS tagger (the “hybrid” parser) (1) extracted complex neuroanatomical terms and acronyms from selected cat and primate thalamic atlases and (2) facilitated the semi-automated generation of a highly granular thalamic terminology. The implementation of string-matching algorithms (1) reconciled terminological errors introduced by the optical character recognition (OCR) software used to generate the neuroanatomical text and (2) increased the sensitivity of matching the neuroanatomical terms and acronyms that the “hybrid” parser generated for the two neuroanatomical domains.

Index Entries

Term similarity; thalamic atlas; neuroanatomical indexing; information retrieval; string matching; statistical parser

Copyright information

© Humana Press Inc 2003

Authors and Affiliations

  • Prashanti R. Srinivas (1)
  • Daniel Gusfield (2)
  • Oliver Mason (3)
  • Michael Gertz (2)
  • Michael Hogarth (4)
  • James Stone (1)
  • Edward G. Jones (1)
  • Fredric A. Gorin (1, 5)

  1. Center for Neuroscience, University of California at Davis, Davis
  2. Department of Computer Science, University of California at Davis, Davis
  3. Department of English, School of Humanities, The University of Birmingham, Birmingham, UK
  4. Departments of Pathology and Internal Medicine, University of California at Davis, Davis
  5. Department of Neurology, School of Medicine, University of California at Davis, Davis
