Commonsense Validation and Explanation for Arabic Sentences

  • Conference paper
Emerging Trends and Applications in Artificial Intelligence (ICETAI 2023)

Abstract

Commonsense understanding remains a significant challenge for natural language processing, especially in complex languages such as Arabic. Recent advances in deep learning, however, have improved performance on many language tasks, including distinguishing sentences that make sense from those that do not. This work addresses SemEval-2020 Task 4 (ComVE) by developing classification and text generation models tailored to Arabic. The task comprises three subtasks: Subtask A selects which of two given sentences makes sense; Subtask B selects, from multiple choices, the most appropriate reason why a sentence goes against common sense; and Subtask C generates an explanation of why a sentence violates common sense. Our models build on a set of multilingual pre-trained transformer models. In Subtask A, we reach an accuracy of 84.7%, surpassing other reported results for Arabic. In Subtask B, our approach achieves 79.3%, outperforming other multilingual approaches and the 61% obtained by the state-of-the-art BERT model. In Subtask C, our model generates explanations with a BLEU score of 24, which is considered acceptable for text generation, particularly in Arabic.
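As a rough illustration of how Subtasks A and B are framed and scored, the sketch below represents each item as a small record and computes accuracy over predicted indices. The class names, the `first_option_baseline` function, and the English toy sentences are hypothetical and are not taken from the paper or its Arabic dataset. A real system would replace the baseline with a transformer classifier.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ValidationItem:
    """Subtask A item: two sentences, exactly one of which makes sense."""
    sentences: Sequence[str]
    sensible_index: int  # gold label: index of the sentence that makes sense


@dataclass
class ReasonItem:
    """Subtask B item: a nonsensical sentence and candidate reasons."""
    false_sentence: str
    reasons: Sequence[str]
    correct_index: int  # gold label: index of the most appropriate reason


def accuracy(gold: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of items whose predicted index equals the gold index."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def first_option_baseline(n_items: int) -> List[int]:
    """Hypothetical stand-in for a model: always predicts option 0."""
    return [0] * n_items


# Toy data for illustration only (assumed, not from the Arabic dataset).
subtask_a = [
    ValidationItem(("He put a turkey into the fridge.",
                    "He put an elephant into the fridge."), 0),
    ValidationItem(("She drank a cup of sand.",
                    "She drank a cup of tea."), 1),
]
subtask_b = [
    ReasonItem("He put an elephant into the fridge.",
               ("An elephant is much bigger than a fridge.",
                "Elephants are usually grey.",
                "A fridge keeps food cold."), 0),
]

acc_a = accuracy([item.sensible_index for item in subtask_a],
                 first_option_baseline(len(subtask_a)))
acc_b = accuracy([item.correct_index for item in subtask_b],
                 first_option_baseline(len(subtask_b)))
print(f"Subtask A accuracy: {acc_a:.2f}")
print(f"Subtask B accuracy: {acc_b:.2f}")
```

Subtask C is a generation task and is scored with BLEU rather than accuracy, so it does not fit this index-matching scheme.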

Supported by Jordan University of Science and Technology.

Notes

  1. https://github.com/ibrahim810/commonsense_ar_googleAPITranslate/

Author information

Corresponding author

Correspondence to Farah Alshanik.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Alshanik, F., Al-Sharif, I., Abdullah, M.W. (2024). Commonsense Validation and Explanation for Arabic Sentences. In: García Márquez, F.P., Jamil, A., Hameed, A.A., Segovia Ramírez, I. (eds) Emerging Trends and Applications in Artificial Intelligence. ICETAI 2023. Lecture Notes in Networks and Systems, vol 960. Springer, Cham. https://doi.org/10.1007/978-3-031-56728-5_9

  • DOI: https://doi.org/10.1007/978-3-031-56728-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56727-8

  • Online ISBN: 978-3-031-56728-5

  • eBook Packages: Engineering, Engineering (R0)
