Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8003)

Abstract

An implementation of a non-structural example-based translation system that translates sentences from Arabic to English, using a bilingual parallel corpus, is described. Each new input sentence is fragmented into phrases, and those phrases are matched to example patterns, using various levels of morphological data. We study the effect of forcing the system to match only fragments that do not break base phrases in the middle, and the results for small corpora are encouraging.
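The base-phrase constraint can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the input has already been chunked with BIO tags (`B-`, `I-`, `O`), and the function name `chunk_safe_fragments`, its `min_len` parameter, and the English placeholder tokens are all hypothetical. It enumerates only those contiguous fragments whose boundaries do not fall inside a base phrase.

```python
from typing import List, Tuple


def chunk_safe_fragments(tokens: List[str], bio_tags: List[str],
                         min_len: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, that do not cut a base phrase.

    Assumption: bio_tags uses the BIO scheme, where an "I-" tag marks a
    token that continues the chunk started by the previous token.
    """
    if len(tokens) != len(bio_tags):
        raise ValueError("tokens and bio_tags must have the same length")
    n = len(tokens)
    # A cut before position k is allowed unless token k continues a chunk.
    # The sentence start (0) and end (n) are always valid cut points.
    cut_ok = [k == 0 or k == n or not bio_tags[k].startswith("I-")
              for k in range(n + 1)]
    return [(i, j)
            for i in range(n) if cut_ok[i]
            for j in range(i + min_len, n + 1) if cut_ok[j]]


# Placeholder English tokens stand in for a chunked Arabic sentence.
tokens = ["the", "red", "car", "stopped", "suddenly"]
tags = ["B-NP", "I-NP", "I-NP", "B-VP", "O"]
fragments = chunk_safe_fragments(tokens, tags)
```

Here the noun phrase "the red car" is never split, so a fragment such as "red car stopped" is excluded. In a full system, each surviving fragment would then be looked up among the corpus examples; that matching step is outside this sketch.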

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bar, K., Choueka, Y., Dershowitz, N. (2014). Matching Phrases for Arabic-to-English Example-Based Translation System. In: Dershowitz, N., Nissan, E. (eds) Language, Culture, Computation. Computational Linguistics and Linguistics. Lecture Notes in Computer Science, vol 8003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45327-4_4

  • DOI: https://doi.org/10.1007/978-3-642-45327-4_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45326-7

  • Online ISBN: 978-3-642-45327-4

  • eBook Packages: Computer Science (R0)
