Abstract
Commonsense understanding poses a significant challenge, especially in complex languages such as Arabic. Recent advances in deep learning have nevertheless improved performance on many language tasks, including distinguishing sentences that make sense from those that do not. This research participates in the SemEval 2020 Task 4 (ComVE) competition by developing classification and text generation models tailored to the Arabic language. The competition comprises three subtasks: Subtask A involves choosing which of two given sentences makes sense; Subtask B requires selecting, from multiple choices, the most appropriate reason why a sentence goes against common sense; and Subtask C entails generating an explanation of why a sentence violates common sense. Our models build on a set of multilingual pre-trained transformer models. In Subtask A, our accuracy reached 84.7%, surpassing the performance of other works in Arabic. In Subtask B, our approach outperformed other multilingual approaches, achieving 79.3% compared with 61% for the state-of-the-art BERT model. In Subtask C, our model generated explanations with a BLEU score of 24, which is considered acceptable in the domain of text generation, particularly for Arabic.
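The input and output shapes of the two classification subtasks, and the accuracy metric reported for them, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class names, the `accuracy` helper, the English toy sentences, and the prediction lists are all assumptions introduced here, and no model is trained or loaded.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SubtaskAExample:
    """Two sentences; label is the index (0 or 1) of the one that makes sense."""
    sentences: Sequence[str]
    label: int


@dataclass
class SubtaskBExample:
    """A sentence that violates common sense, candidate reasons, and the
    index of the most appropriate reason."""
    statement: str
    reasons: Sequence[str]
    label: int


def accuracy(predictions: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of predicted labels that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical toy data in English, for illustration only
# (the paper works with Arabic sentences).
examples_a = [
    SubtaskAExample(("He put a turkey into the fridge.",
                     "He put an elephant into the fridge."), 0),
    SubtaskAExample(("She drank a chair.", "She drank a glass of water."), 1),
    SubtaskAExample(("The sun rises in the morning.",
                     "The sun rises at midnight."), 0),
    SubtaskAExample(("Fish live in trees.", "Fish live in water."), 1),
]
example_b = SubtaskBExample(
    statement="He put an elephant into the fridge.",
    reasons=(
        "An elephant is much bigger than a fridge.",
        "Elephants are usually gray.",
        "A fridge keeps food cold.",
    ),
    label=0,
)

# Hypothetical model outputs: one wrong answer out of four for Subtask A.
predictions_a = [0, 1, 1, 1]
score_a = accuracy(predictions_a, [ex.label for ex in examples_a])
score_b = accuracy([0], [example_b.label])
print(f"Subtask A accuracy: {score_a:.2f}")
print(f"Subtask B accuracy: {score_b:.2f}")
```

Subtask C is a generation task and is scored with BLEU against reference explanations rather than with accuracy, so it is not covered by this sketch.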
Supported by Jordan University of Science and Technology.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Alshanik, F., Al-Sharif, I., Abdullah, M.W. (2024). Commonsense Validation and Explanation for Arabic Sentences. In: García Márquez, F.P., Jamil, A., Hameed, A.A., Segovia Ramírez, I. (eds) Emerging Trends and Applications in Artificial Intelligence. ICETAI 2023. Lecture Notes in Networks and Systems, vol 960. Springer, Cham. https://doi.org/10.1007/978-3-031-56728-5_9
DOI: https://doi.org/10.1007/978-3-031-56728-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56727-8
Online ISBN: 978-3-031-56728-5
eBook Packages: Engineering, Engineering (R0)