Abstract
The AVENUE project contains a run-time machine translation program that is surrounded by pre- and post-run-time modules. The post-run-time module selects among translation alternatives. The pre-run-time modules are concerned with elicitation of data and automatic learning of transfer rules in order to facilitate the development of machine translation between a language with extensive resources for natural language processing and a language with few resources for natural language processing. This paper describes the run-time transfer-based machine translation system as well as two of the pre-run-time modules: elicitation of data from the minority language and automated learning of transfer rules from the elicited data.
Cite this article
Probst, K., Levin, L., Peterson, E. et al. MT for Minority Languages Using Elicitation-Based Learning of Syntactic Transfer Rules. Machine Translation 17, 245–270 (2002). https://doi.org/10.1023/B:COAT.0000021003.55041.fd
DOI: https://doi.org/10.1023/B:COAT.0000021003.55041.fd