Abstract
The evaluation of text summaries remains a challenging task despite more than two decades of research in the field. This paper describes an automatic method for assessing Arabic text summaries. The proposed method predicts the “Overall Responsiveness” manual score, which combines the content quality and the linguistic quality of a summary. To predict this score, we aggregate three types of features with a regression function: lexical similarity features, semantic similarity features, and linguistic features. The semantic features comprise several semantic similarity scores based on the BERT model, while the linguistic features are based on entropy scores. To compute the similarity between a candidate summary and a reference summary, we first perform an exact match between n-grams; the unmatched n-grams are then represented as BERT vectors, and the similarity between these vectors is computed. The proposed method yields competitive results compared to metrics based on lexical similarity, such as ROUGE.
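The pipeline described in the abstract — exact n-gram matching, vector similarity for unmatched n-grams, an entropy-based linguistic feature, and regression aggregation — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the linear weights are hypothetical, and in the paper the vectors come from AraBERT rather than being passed in directly.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def exact_match_ratio(candidate, reference, n=1):
    """Fraction of candidate n-grams found verbatim in the reference
    (clipped counts, in the spirit of ROUGE-style lexical overlap)."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    overlap = sum(min(c, ref[g]) for g, c in cand.items() if g in ref)
    total = sum(cand.values())
    return overlap / total if total else 0.0

def word_entropy(tokens):
    """Shannon entropy (bits) of the word distribution —
    a simple entropy-based linguistic feature."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def cosine(u, v):
    """Cosine similarity between two embedding vectors; in the paper the
    unmatched n-grams would be embedded with AraBERT before this step."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def overall_responsiveness(features, weights, bias=0.0):
    """Linear aggregation of feature scores; in practice the weights
    would be learned by the regression function, not fixed by hand."""
    return bias + sum(w * f for w, f in zip(weights, features))
```

For a candidate/reference pair, one would compute the lexical score with `exact_match_ratio`, embed the leftover n-grams and score them with `cosine`, add `word_entropy` of the candidate, and feed the three scores to `overall_responsiveness` with learned weights.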
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Ellouze, S., Jaoua, M. (2022). Towards an Arabic Text Summaries Evaluation Based on AraBERT Model. In: Guizzardi, R., Ralyté, J., Franch, X. (eds) Research Challenges in Information Science. RCIS 2022. Lecture Notes in Business Information Processing, vol 446. Springer, Cham. https://doi.org/10.1007/978-3-031-05760-1_4
Print ISBN: 978-3-031-05759-5
Online ISBN: 978-3-031-05760-1