Machine Translation, 22:205

Generating Arabic text in multilingual speech-to-speech machine translation framework

  • Azza Abdel Monem
  • Khaled Shaalan
  • Ahmed Rafea
  • Hoda Baraka

Abstract

The interlingual approach to machine translation (MT) is used successfully in multilingual translation. It performs the translation task in two independent steps. First, the meaning of each source-language sentence is captured in a language-independent intermediate representation (the Interlingua). Then, target-language sentences are generated from those meaning representations. Arabic natural language processing in general is still underdeveloped, and Arabic natural language generation (NLG) is even less developed. In particular, Arabic NLG from Interlinguas has been investigated only with template-based approaches. Moreover, tools built for other languages are not easily adapted to Arabic because of the language’s complexity at both the morphological and syntactic levels. In this paper, we describe a rule-based generation approach for task-oriented, Interlingua-based spoken dialogue that transforms a relatively shallow semantic interlingual representation, called interchange format (IF), into Arabic text that corresponds to the intentions underlying the speaker’s utterances. The approach addresses Arabic syntactic structure determination and Arabic morphological and syntactic generation within the interlingual MT approach. It was developed primarily within the framework of the NESPOLE! (NEgotiating through SPOken Language in E-commerce) multilingual speech-to-speech MT project. The IF-to-Arabic generator is implemented in SICStus Prolog. We conducted evaluation experiments using the input and output of the English analyzer developed by the NESPOLE! team at Carnegie Mellon University. The results were promising and confirmed that the rule-based approach can generate Arabic translations from Interlingua representations in the travel and tourism domain.
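The two-step idea described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the SICStus Prolog used for the actual generator, and everything in it is an assumption made for illustration: the `Meaning` structure is a toy stand-in and not the NESPOLE! interchange format, and `analyze_english`, `generate_arabic`, and the small lexicon are hypothetical. It also skips the morphological and syntactic generation that the paper addresses; the point is only that analysis and generation share nothing except the intermediate representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    """Toy language-independent representation (hypothetical, not the real IF)."""
    speech_act: str
    concept: str = ""
    args: tuple = ()


def analyze_english(sentence: str) -> Meaning:
    """Step 1 (hypothetical analyzer): source sentence -> meaning representation."""
    text = sentence.lower().strip(" ?.!")
    prefix = "how much is the "
    if text in {"hello", "hi"}:
        return Meaning("greeting")
    if text.startswith("thank"):
        return Meaning("thank")
    if text.startswith(prefix):
        return Meaning("request-information", "price", (("object", text[len(prefix):]),))
    raise ValueError(f"no analysis rule for: {sentence!r}")


# Hypothetical Arabic lexicon: concept name -> definite noun.
ARABIC_LEXICON = {"room": "الغرفة", "trip": "الرحلة"}


def generate_arabic(meaning: Meaning) -> str:
    """Step 2 (hypothetical generator): meaning representation -> Arabic text.

    Each branch is one hand-written rule keyed on the speech act and concept;
    it never looks at the English source sentence.
    """
    if meaning.speech_act == "greeting":
        return "مرحبا"
    if meaning.speech_act == "thank":
        return "شكرا"
    if meaning.speech_act == "request-information" and meaning.concept == "price":
        obj = dict(meaning.args)["object"]
        return "كم سعر " + ARABIC_LEXICON[obj] + "؟"
    raise ValueError(f"no generation rule for: {meaning!r}")


def translate(sentence: str) -> str:
    """Chain the two independent steps through the intermediate representation."""
    return generate_arabic(analyze_english(sentence))
```

Because `generate_arabic` depends only on `Meaning`, an analyzer for any other source language could be paired with it unchanged, which is the property that makes the interlingual approach attractive for multilingual systems.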

Keywords

Machine translation, Interlingua, Rule-based text generation, Natural language generation, Arabic natural language processing

Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  • Azza Abdel Monem (1)
  • Khaled Shaalan (2, 3)
  • Ahmed Rafea (4)
  • Hoda Baraka (5)

  1. Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
  2. School of Informatics, University of Edinburgh, Edinburgh, UK
  3. Faculty of Informatics, The British University in Dubai, Dubai, UAE
  4. Computer Science Department, American University in Cairo, Cairo, Egypt
  5. Faculty of Engineering, Cairo University, Giza, Egypt
