Journal of Zhejiang University SCIENCE C

Volume 14, Issue 3, pp 179–196

Punjabi DeConverter for generating Punjabi from Universal Networking Language

Article

Abstract

DeConverter is core software in a Universal Networking Language (UNL) system, which has two major components: EnConverter and DeConverter. EnConverter converts a natural language sentence into an equivalent UNL expression, and DeConverter generates a natural language sentence from an input UNL expression. This paper presents the design and development of a Punjabi DeConverter. It describes and illustrates the five phases of the proposed DeConverter, i.e., UNL parser, lexeme selection, morphology generation, function word insertion, and syntactic linearization, with a special focus on syntactic linearization, the process of defining the arrangement of words in the generated output. Algorithms and pseudocode are given for the syntactic linearization of a simple UNL graph, a UNL graph with scope nodes, and a UNL graph containing a node with un-traversed or multiple parents. Special cases of syntactic linearization in Punjabi for UNL relations such as ‘and’, ‘or’, ‘fmt’, ‘cnt’, and ‘seq’ are also presented, along with implementation results. The DeConverter has been tested on 1000 UNL expressions, taking a Spanish UNL language server and agricultural domain threads developed by the Indian Institute of Technology (IIT), Bombay, India, as gold standards. Of the sentences generated by the proposed system, 89.0% are grammatically correct and 92.0% are faithful to the original sentences; the system has a fluency score of 3.61 and an adequacy score of 3.70 on a 4-point scale, and achieves a bilingual evaluation understudy (BLEU) score of 0.72.
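
To make the idea of syntactic linearization concrete, the following is a minimal Python sketch and not the algorithm of the paper. It assumes simplified UNL relations of the form `rel(parent, child)` without universal-word constraints, a hypothetical `RELATION_PRIORITY` table, and a verb-final ordering in which every child is emitted before its parent. The function names `parse_unl` and `linearize` are illustrative only. A node that has several parents is emitted once, at its first visit, which is one simple way to handle the multiple-parent case.

```python
import re
from collections import defaultdict

# Assumed priority of UNL relations for a verb-final (SOV) order; illustrative only.
RELATION_PRIORITY = {"agt": 0, "tim": 1, "plc": 2, "obj": 3}

# Matches simplified binary relations such as "agt(eat.@entry, boy)".
RELATION_RE = re.compile(r"^(\w+)\(([^,]+),\s*([^)]+)\)$")


def headword(node):
    """Drop UNL attributes such as '.@entry' from a simplified node label."""
    return node.split(".")[0]


def parse_unl(lines):
    """Return (entry_headword, children) from 'rel(parent, child)' lines."""
    children = defaultdict(list)
    entry = None
    for line in lines:
        match = RELATION_RE.match(line.strip())
        if match is None:
            raise ValueError(f"not a binary relation: {line!r}")
        rel, parent, child = (part.strip() for part in match.groups())
        for node in (parent, child):
            if "@entry" in node.split(".")[1:]:
                entry = headword(node)
        children[headword(parent)].append((rel, headword(child)))
    if entry is None:
        raise ValueError("no node carries the @entry attribute")
    return entry, children


def linearize(node, children, visited=None):
    """Depth-first traversal that emits every child before its parent."""
    if visited is None:
        visited = set()
    if node in visited:  # a node with several parents is emitted only once
        return []
    visited.add(node)
    ordered = sorted(
        children.get(node, []),
        key=lambda pair: RELATION_PRIORITY.get(pair[0], len(RELATION_PRIORITY)),
    )
    words = []
    for _, child in ordered:
        words.extend(linearize(child, children, visited))
    words.append(node)
    return words


unl = [
    "agt(eat.@entry, boy)",
    "obj(eat.@entry, rice)",
    "plc(eat.@entry, kitchen)",
]
entry, graph = parse_unl(unl)
print(linearize(entry, graph))  # ['boy', 'kitchen', 'rice', 'eat']
```

The output is a sequence of headwords only; in a full DeConverter the lexeme selection, morphology generation, and function word insertion phases would supply the Punjabi word forms and postpositions around this ordering.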

Key words

DeConverter; EnConverter; Machine translation; Universal Networking Language (UNL); Syntactic linearization

CLC number

TP391 

Copyright information

© Journal of Zhejiang University Science Editorial Office and Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Department of Computer Science & Engineering, Thapar University, Patiala, India
  2. School of Mathematics & Computer Applications, Thapar University, Patiala, India
