Machine Translation, Volume 17, Issue 4, pp 245–270

MT for Minority Languages Using Elicitation-Based Learning of Syntactic Transfer Rules

  • Katharina Probst
  • Lori Levin
  • Erik Peterson
  • Alon Lavie
  • Jaime Carbonell


The AVENUE project contains a run-time machine translation program that is surrounded by pre- and post-run-time modules. The post-run-time module selects among translation alternatives. The pre-run-time modules are concerned with elicitation of data and automatic learning of transfer rules in order to facilitate the development of machine translation between a language with extensive resources for natural language processing and a language with few resources for natural language processing. This paper describes the run-time transfer-based machine translation system as well as two of the pre-run-time modules: elicitation of data from the minority language and automated learning of transfer rules from the elicited data.

elicitation, rule learning, syntactic transfer rules, minority languages





Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Katharina Probst (1)
  • Lori Levin (1)
  • Erik Peterson (1)
  • Alon Lavie (1)
  • Jaime Carbonell (1)

  1. Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA
