Abstract
Arabic poetry is a delicate form of literature that requires simultaneous adherence to a specific meter and rhyme. These constraints have made it difficult for computers to compose poems correctly. Relatively few studies in the literature address Arabic poem generation, owing to the challenges of processing and understanding the Arabic language. However, recent advances in Arabic natural language processing, especially the rise of transformers, have enabled computers to understand Arabic better and to solve related complex tasks. We therefore fine-tuned the most advanced Arabic pre-trained transformer, AraGPT2, on a large poetry corpus. AraGPT2 comes in several variants (Base, Medium, Large, and Mega). The corpus contains a vast number of poems gathered from different eras, enabling us to train and fine-tune our models efficiently. We use the standard evaluation measures, perplexity and BLEU score, to validate the generated poems. In addition, we introduce new ways to measure the quality of generated Arabic poems and the models' capabilities. The BLEU results show that our proposed model achieves a high score compared with previous benchmark studies. Expert evaluation shows that our Mega model outperforms all previous work on fluency, meaning, coherence, and poetic scales. A further survey of volunteers with little background in poetry showed that verses generated by the Mega model fooled about 68.9% of respondents. In conclusion, this study shows that advances in Arabic natural language processing make it feasible to generate Arabic poems that respect a particular meter and rhyme. These results also raise concerns about the threats that could arise from misuse of such a tool.
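The BLEU evaluation mentioned above can be illustrated with a minimal sketch in pure Python. This is not the paper's exact configuration (the smoothing scheme and n-gram order here are simplifying assumptions); it only shows the standard mechanics of modified n-gram precision with a brevity penalty, as in Papineni et al. (2002):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU with uniform weights up to max_n.

    Add-one smoothing is applied to each precision (an assumption made
    here so short verses with a missing n-gram order do not score zero).
    """
    ref, cand = reference.split(), candidate.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped (modified) n-gram matches against the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        log_precisions.append(math.log((overlap + 1) / (total + 1)))
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(log_precisions) / max_n)
```

A candidate identical to its reference scores 1.0, while partial overlaps fall between 0 and 1; in practice a smoothed library implementation (e.g. NLTK's `sentence_bleu`) would be used on the tokenized generated verses.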
Acknowledgements
This research was supported by Princess Sumaya University for Technology (PSUT), which provided free access to powerful servers. We would like to thank Ali Fadel and Mohammad Beheitt for their valuable contributions.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Abboushi, O., Azzeh, M. Toward Fluent Arabic Poem Generation Based on Fine-tuning AraGPT2 Transformer. Arab J Sci Eng 48, 10537–10549 (2023). https://doi.org/10.1007/s13369-023-07692-1