Using Word Formation Rules to Extend MT Lexicons

Part of the Lecture Notes in Computer Science book series (LNAI, volume 2499)

Abstract

In the IBM LMT Machine Translation (MT) system, a built-in strategy provides lexical coverage of a particular subset of words that are not listed in its bilingual lexicons. The recognition and coding of these words, and the generation of their transfers, are based on a set of derivational morphological rules. A new utility writes unfound words of this type, in an LMT-compatible format, to an auxiliary bilingual lexical file (ALF) that is subsequently merged into the core lexicons. What characterizes this approach is the use of morphological, semantic, and syntactic features for both analysis and transfer. The ALF has to be revised before it is merged into the core lexicons. The utility integrates linguistics-based analysis and transfer rules with a corpus-based method of verifying or falsifying linguistic hypotheses against extensive document translation, which in addition yields statistics on frequency of occurrence as well as local context.
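
The workflow described in the abstract can be illustrated with a minimal sketch. This is not LMT's implementation or file format: the toy lexicon, the single prefix rule, and the names `CORE_LEXICON`, `Rule`, `AlfEntry`, `analyze`, and `build_alf` are all hypothetical, chosen only to show how a derivational rule might resolve an unfound word to a known base word, generate a candidate transfer, and record frequency and local context for later human revision.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical toy core lexicon: source base word -> (part of speech, target word).
CORE_LEXICON = {
    "configure": ("verb", "configurar"),
    "install": ("verb", "instalar"),
}


@dataclass(frozen=True)
class Rule:
    """A toy derivational rule: source prefix plus a target-side transfer function."""
    prefix: str
    base_pos: str                      # part of speech the base word must have
    derived_pos: str                   # part of speech of the derived word
    transfer: Callable[[str], str]     # builds the target word from the base's target


# One illustrative rule (an assumption, not taken from the paper): "re-" + verb.
RULES = [Rule("re", "verb", "verb", lambda target: "re" + target)]


@dataclass
class AlfEntry:
    """A candidate entry for the auxiliary lexical file, pending human revision."""
    source: str
    pos: str
    target: str
    base: str
    frequency: int = 0
    contexts: List[str] = field(default_factory=list)


def analyze(word: str) -> Optional[AlfEntry]:
    """Return a candidate entry if an unfound word derives from a listed base word."""
    if word in CORE_LEXICON:
        return None  # already covered by the core lexicon
    for rule in RULES:
        if word.startswith(rule.prefix):
            base = word[len(rule.prefix):]
            listed = CORE_LEXICON.get(base)
            if listed is not None and listed[0] == rule.base_pos:
                return AlfEntry(word, rule.derived_pos, rule.transfer(listed[1]), base)
    return None


def build_alf(sentences: List[str]) -> Dict[str, AlfEntry]:
    """Collect candidate entries with occurrence counts and local context."""
    alf: Dict[str, AlfEntry] = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            word = token.strip(".,;:!?")
            if word not in alf:
                candidate = analyze(word)
                if candidate is None:
                    continue
                alf[word] = candidate
            alf[word].frequency += 1
            alf[word].contexts.append(sentence)
    return alf


if __name__ == "__main__":
    corpus = ["Reinstall the driver.", "Then reinstall and reconfigure it."]
    for entry in build_alf(corpus).values():
        print(entry.source, entry.pos, entry.target, entry.frequency)
```

In this sketch the frequency counts and stored sentences stand in for the corpus-based evidence a reviewer could use to accept or reject each hypothesized entry before a merge into the core lexicon.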

Keywords

  • Machine Translation
  • Text Processing
  • Base Word
  • Syntactic Feature
  • Derivational Morphology

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Chapter length: 10 pages

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gdaniec, C., Manandise, E. (2002). Using Word Formation Rules to Extend MT Lexicons. In: Richardson, S.D. (ed.) Machine Translation: From Research to Real Users. AMTA 2002. Lecture Notes in Computer Science, vol 2499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45820-4_7

  • DOI: https://doi.org/10.1007/3-540-45820-4_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44282-0

  • Online ISBN: 978-3-540-45820-3

  • eBook Packages: Springer Book Archive