Meaning Inference of Abbreviations Appearing in Clinical Studies

  • Efthymios Chondrogiannis
  • Vassiliki Andronikou
  • Efstathios Karanastasis
  • Theodora Varvarigou
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 563)


The number of publicly available clinical studies is constantly increasing, forming a promising corpus of documents for clinical research. However, the abbreviations used in these studies pose a serious barrier to text mining techniques. This paper presents a study in this domain, which used purpose-built tools and mechanisms to process a number of randomly selected clinical study documents. The analysis indicated that abbreviations appear on a large scale without their long form (also known as expansion). Assessing an abbreviation's true meaning requires using an appropriate corpus of documents, applying innovative algorithms and techniques to detect its possible expansions, and then selecting the appropriate one. Furthermore, the discrimination power of tokens plays a distinctive role in how abbreviations are constructed, and hence it can facilitate the detection of acronym-type abbreviations. Additionally, the expressions in which abbreviations appear, as well as the preceding and following text, are of primary importance for selecting the appropriate meaning.
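To make the idea of acronym-type detection concrete, the following is a minimal sketch and not the method used in the paper. It assumes that a small stop-word list (`LOW_POWER_TOKENS`) can stand in for tokens with low discrimination power, and that an expansion is found in the text immediately preceding the short form. The function names `matches_acronym` and `find_expansion` and the `max_extra` parameter are hypothetical and chosen only for illustration.

```python
import re

# Assumption: a tiny stop-word list approximates "low discrimination power" tokens.
LOW_POWER_TOKENS = {"of", "the", "and", "in", "for", "to", "a", "an", "with", "on"}


def tokenize(text):
    """Split text into lowercase alphanumeric tokens (hyphens separate tokens)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_acronym(short_form, candidate):
    """Return True if the initials of the candidate's tokens spell short_form.

    Low-power tokens may contribute their initial or be skipped;
    every other token must contribute its initial.
    """
    letters = short_form.lower()
    tokens = tokenize(candidate)

    def walk(i, j):
        # i indexes tokens, j indexes letters of the short form.
        if i == len(tokens):
            return j == len(letters)
        token = tokens[i]
        # Option 1: the token contributes its initial letter.
        if j < len(letters) and token[0] == letters[j] and walk(i + 1, j + 1):
            return True
        # Option 2: skip the token, allowed only for low-power tokens.
        return token in LOW_POWER_TOKENS and walk(i + 1, j)

    return walk(0, 0)


def find_expansion(short_form, preceding_text, max_extra=2):
    """Return the shortest phrase ending the preceding text that matches, or None."""
    tokens = tokenize(preceding_text)
    # Allow a few extra tokens beyond the acronym length for skipped stop words.
    longest = min(len(tokens), len(short_form) + max_extra)
    for size in range(1, longest + 1):
        candidate = " ".join(tokens[-size:])
        if matches_acronym(short_form, candidate):
            return candidate
    return None
```

For example, `find_expansion("BMI", "patients with a body mass index")` returns `"body mass index"`. A heuristic of this kind only covers abbreviations whose long form appears nearby; choosing among several known meanings when no long form is given would additionally require the surrounding context, as the abstract notes.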


Abbreviations Expansion Clinical studies Semantic analysis Corpus annotation 



This work is being supported by the OpenScienceLink project [8] and has been partially funded by the European Commission’s CIP-PSP under contract number 325101. This paper expresses the opinions of the authors and not necessarily those of the European Commission. The European Commission is not liable for any use that may be made of the information contained in this paper.


  1. Gale, W.A., Church, K.W., Yarowsky, D.: One sense per discourse. In: Proceedings of the Workshop on Speech and Natural Language HLT 1991, pp. 233–237. New York (1992)
  2. Schwartz, A.S., Hearst, M.A.: A simple algorithm for identifying abbreviation definitions in biomedical text. In: Proceedings of PSB, pp. 451–462 (2003)
  3. EU Clinical Trials Register.
  4. Porter, M.F.: An algorithm for suffix stripping. Program 40(3), 211–218 (2006)
  5.
  6. Medical Subject Headings (MeSH).
  7.
  8. Karanastasis, E., Andronikou, V., Chondrogiannis, E., Tsatsaronis, G., Eisinger, D., Petrova, A.: The OpenScienceLink architecture for novel services exploiting open access data in the biomedical domain. In: Proceedings of PCI 2014, pp. 28:1–28:6. ACM, New York (2014)
  9. Xu, Y., Wang, Z., Lei, Y., Zhao, Y., Xue, Y.: MBA: a literature mining system for extracting biomedical abbreviations. BMC Bioinform. 10, 14 (2009)
  10. McCarthy, D., Koeling, R., Weeds, J., Carroll, J.: Finding predominant word senses in untagged text. In: Proceedings of ACL 2004, Stroudsburg, PA, USA, pp. 280–287 (2004)
  11. Stevenson, M., Guo, Y., Amri, A.A., Gaizauskas, R.: Disambiguation of biomedical abbreviations. In: Proceedings of BioNLP 2009, Boulder, Colorado, USA, pp. 71–79 (2009)
  12. McInnes, B.T., Pedersen, T., Carlis, J.: Using UMLS concept unique identifiers (CUIs) for word sense disambiguation in the biomedical domain. In: AMIA 2007, pp. 533–537 (2007)
  13. CT abbreviations-annotated corpus.
  14. Chang, J.T., Schütze, H., Altman, R.B.: Creating an online dictionary of abbreviations from MEDLINE. J. Am. Med. Inform. Assoc. 9(6), 612–620 (2002)
  15. Pustejovsky, J., Castaño, J., Cochran, B., Kotecki, M., Morrell, M.: Automatic extraction of acronym-meaning pairs from MEDLINE databases. Stud. Health Tech. I. 84(1), 371–375 (2001)
  16. Zhou, W., Torvik, V.I., Smalheiser, N.R.: ADAM: another database of abbreviations in MEDLINE. Bioinformatics 22(22), 2813–2818 (2006)
  17. Park, Y., Byrd, R.J.: Hybrid text mining for finding abbreviations and their definitions. In: Proceedings of EMNLP 2001 Conference, pp. 126–133 (2001)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Efthymios Chondrogiannis (1)
  • Vassiliki Andronikou (1)
  • Efstathios Karanastasis (1)
  • Theodora Varvarigou (1)
  1. National Technical University of Athens, Athens, Greece
