
Toward Fluent Arabic Poem Generation Based on Fine-tuning AraGPT2 Transformer

  • Research Article: Computer Engineering and Computer Science
  • Published in: Arabian Journal for Science and Engineering

Abstract

Arabic poetry is a delicate form of literature that requires simultaneous adherence to a specific meter and rhyme. These constraints have made it difficult for computers to compose poems correctly. Relatively few studies in the literature focus on generating Arabic poems, owing to the challenges of processing and understanding the Arabic language. However, recent advances in Arabic natural language processing, especially the rise of transformers, have paved the way for computers to understand Arabic better and to solve complex language tasks. We therefore fine-tuned the most advanced Arabic pre-trained transformer, AraGPT2, on a large poetry corpus. AraGPT2 comes in different variants (Base, Medium, Large, and Mega). The employed corpus contains a vast number of poems gathered from different eras, enabling us to train and fine-tune our model efficiently. We use the standard evaluation measures, Perplexity and BLEU, to validate the generated poems. In addition, we introduce new ways to measure the quality of generated Arabic poems and the models' capabilities. The BLEU results show that our proposed model achieves a high score compared with previous benchmark studies. Expert evaluation shows that our Mega model outperforms all previous work on fluency, meaning, coherence, and poetic scales. Another survey, sent to volunteers with little background in poetry, showed that verses generated by the Mega model fooled about 68.9% of the respondents. In conclusion, this study shows that advances in Arabic natural language processing make it feasible to generate Arabic poems with a particular meter and rhyme. These results also raise concerns about threats that might arise from misuse of this tool.
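The two standard measures named above can be sketched in a few lines of Python. This is a minimal, self-contained illustration, not the authors' evaluation code: sentence-level BLEU with up to 4-grams and a brevity penalty (following Papineni et al., 2002, in simplified single-reference form), and perplexity computed as the exponential of the average negative log-likelihood per token.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams of the given order in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(candidate) > len(reference) else math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_avg)

def perplexity(token_log_probs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

For example, a candidate verse identical to its reference scores a BLEU of 1.0, and a model that assigns every token a probability of 0.25 has a perplexity of 4.0; lower perplexity and higher BLEU indicate generated verses closer to the reference poetry.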





Acknowledgements

This research was supported by Princess Sumaya University for Technology (PSUT), which provided free access to powerful servers. We would like to thank Ali Fadel and Mohammad Beheitt for their valuable contributions.

Author information

Corresponding author

Correspondence to Mohammad Azzeh.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Abboushi, O., Azzeh, M. Toward Fluent Arabic Poem Generation Based on Fine-tuning AraGPT2 Transformer. Arab J Sci Eng 48, 10537–10549 (2023). https://doi.org/10.1007/s13369-023-07692-1

