Abstract
This paper proposes an English-to-Tamil machine translation system that uses the Universal Networking Language (UNL) as the intermediate representation. The UNL approach is a hybrid of the rule-based and knowledge-based approaches to machine translation. UNL is a declarative formal language designed specifically to represent semantic data extracted from natural language text. The input English sentence is converted to UNL (enconversion), which is then converted to a Tamil sentence (deconversion) in a way that preserves the meaning of the input sentence. The UNL representation was modified to suit the translation process. A new sentence formation algorithm is also proposed to arrange the translated Tamil words into sentences. The translation system was evaluated using the bilingual evaluation understudy (BLEU) score. A BLEU score of 0.581 was achieved, which indicates that most of the information in the input sentence is retained in the translated sentence. The scores obtained with the UNL-based approach were compared with those of existing approaches to translation, from which it is concluded that UNL is better suited to machine translation.
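To make the evaluation metric concrete, the following is a minimal sketch of a sentence-level BLEU computation: the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. It is not the evaluation code used in the paper. The function names `ngrams` and `bleu`, the single-reference setup, the uniform weights up to 4-grams, and the absence of smoothing are all assumptions made for illustration; the abstract does not state which BLEU configuration was used, and reported scores are often computed at corpus level.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference (hypothetical helper).

    Assumptions: uniform n-gram weights, no smoothing, so any n-gram
    order with zero matches makes the whole score 0.0.
    """
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precision_sum += math.log(clipped / total)
    # Brevity penalty: penalize candidates no longer than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_precision_sum / max_n)


if __name__ == "__main__":
    reference = "the cat sat on the mat".split()
    candidate = "the cat sat on a mat".split()
    print(round(bleu(candidate, reference), 3))
```

In practice the candidate would be the tokenized Tamil output of the deconverter and the reference a human translation; English tokens are used here only to keep the sketch readable.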
Cite this article
Sridhar, R., Sethuraman, P. & Krishnakumar, K. English to Tamil machine translation system using universal networking language. Sādhanā 41, 607–620 (2016). https://doi.org/10.1007/s12046-016-0504-9
DOI: https://doi.org/10.1007/s12046-016-0504-9