Machine Translation, Volume 9, Issue 3–4, pp 251–283

Acquisition of large lexicons for practical knowledge-based MT

  • Deryle Lonsdale
  • Teruko Mitamura
  • Eric Nyberg


Although knowledge-based MT systems have the potential to achieve high translation accuracy, each successful application system requires a large amount of hand-coded lexical knowledge. Systems like KBMT-89 and its descendants have demonstrated that knowledge-based translation can produce good results in technical domains with tractable domain semantics. Nevertheless, the magnitude of the development task for large-scale applications with tens of thousands of domain concepts precludes a purely hand-crafted approach. The current challenge for the “next generation” of knowledge-based MT systems is to use on-line textual resources and corpus analysis software to automate the most laborious aspects of the knowledge acquisition process. This partial automation can in turn maximize the productivity of human knowledge engineers and help make large-scale applications of knowledge-based MT a viable approach. In this paper we discuss the corpus-based knowledge acquisition methodology used in KANT, a knowledge-based translation system for multilingual document production. This methodology can be generalized beyond the KANT interlingua approach for use with any system that requires similar kinds of knowledge.


knowledge-based machine translation, conceptual coverage, on-line lexical acquisition, lexical mapping, phrasal substructure





Copyright information

© Kluwer Academic Publishers 1995

Authors and Affiliations

  • Deryle Lonsdale (1)
  • Teruko Mitamura (1)
  • Eric Nyberg (1)

  1. Center for Machine Translation, Carnegie Mellon University, Pittsburgh
