Abbreviation Disambiguation: Experiments with Various Variants of the One Sense per Discourse Hypothesis

  • Yaakov HaCohen-Kerner
  • Ariel Kass
  • Ariel Peretz
Conference paper

DOI: 10.1007/978-3-540-69858-6_5

Part of the Lecture Notes in Computer Science book series (LNCS, volume 5039)
Cite this paper as:
HaCohen-Kerner Y., Kass A., Peretz A. (2008) Abbreviation Disambiguation: Experiments with Various Variants of the One Sense per Discourse Hypothesis. In: Kapetanios E., Sugumaran V., Spiliopoulou M. (eds) Natural Language and Information Systems. NLDB 2008. Lecture Notes in Computer Science, vol 5039. Springer, Berlin, Heidelberg

Abstract

Abbreviations are very common and are widely used in both written and spoken language. However, they are not always explicitly defined and in many cases they are ambiguous. In this research, we present a process that attempts to solve the problem of abbreviation ambiguity. Various features have been explored, including context-related methods and statistical methods. The application domain is Jewish Law documents written in Hebrew, which are known to be rich in ambiguous abbreviations. Various variants of the one sense per discourse hypothesis (by varying the scope of discourse) have been implemented. Several common machine learning methods have been tested to find a successful integration of these variants. The best results have been achieved by SVM, with 96.09% accuracy.
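
As an illustration only, the idea of varying the scope of discourse can be sketched as a majority vote over already-resolved occurrences of the same abbreviation within a chosen scope. This is a minimal sketch under stated assumptions, not the authors' implementation: the record fields (`doc`, `para`, `abbrev`, `sense`), the function name `majority_sense_by_scope`, and the two scopes shown (paragraph and document) are hypothetical, and the abstract does not specify which scopes or features the paper actually uses.

```python
from collections import Counter, defaultdict


def majority_sense_by_scope(occurrences, scope_key):
    """Predict one sense per (scope, abbreviation) by majority vote.

    occurrences: list of dicts with hypothetical keys
                 'doc', 'para', 'abbrev', 'sense' (sense may be None).
    scope_key:   function mapping an occurrence to its discourse scope.
    """
    # Count the known senses of each abbreviation inside each scope.
    votes = defaultdict(Counter)
    for occ in occurrences:
        if occ["sense"] is not None:
            votes[(scope_key(occ), occ["abbrev"])][occ["sense"]] += 1

    # Assign every occurrence the most frequent sense in its scope,
    # or None when the scope holds no resolved occurrence.
    predictions = []
    for occ in occurrences:
        counter = votes.get((scope_key(occ), occ["abbrev"]))
        predictions.append(counter.most_common(1)[0][0] if counter else None)
    return predictions


# Toy data: one abbreviation "AB" with two placeholder senses.
occurrences = [
    {"doc": 1, "para": 1, "abbrev": "AB", "sense": "s1"},
    {"doc": 1, "para": 1, "abbrev": "AB", "sense": None},
    {"doc": 1, "para": 2, "abbrev": "AB", "sense": "s2"},
    {"doc": 1, "para": 2, "abbrev": "AB", "sense": "s2"},
    {"doc": 1, "para": 3, "abbrev": "AB", "sense": None},
]

# Narrow scope: one sense per paragraph.
by_paragraph = majority_sense_by_scope(
    occurrences, lambda o: (o["doc"], o["para"])
)

# Wide scope: one sense per document.
by_document = majority_sense_by_scope(occurrences, lambda o: o["doc"])
```

In this toy data the paragraph scope leaves the occurrence in the third paragraph unresolved, while the document scope resolves it at the cost of overriding the minority sense in the first paragraph. Per-scope predictions of this kind could serve as input features for a learner such as SVM, which is one plausible reading of the "integration of these variants" mentioned in the abstract.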

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yaakov HaCohen-Kerner (1)
  • Ariel Kass (1)
  • Ariel Peretz (1)
  1. Department of Computer Science, Jerusalem College of Technology (Machon Lev), Jerusalem, Israel
