Single Arabic Document Summarization Using Natural Language Processing Technique

Recent Advances in NLP: The Case of Arabic Language

Part of the book series: Studies in Computational Intelligence (SCI, volume 874)

Abstract

This chapter presents a method based on natural language processing (NLP) for summarizing a single Arabic document. The suggested method takes an extractive approach, selecting the most valuable information in the document. Because working with Arabic text is considered a challenging task, the chapter applies several NLP techniques to produce an accurate result. The proposed method consists of three phases. The first is a pre-processing phase that unifies synonymous terms, applies stemming, and removes punctuation marks and text decoration. The method then produces feature vectors and scores the features, selects the clauses with the highest scores, and marks them as important clauses. The results of the suggested method are compared against those of traditional methods. For this purpose, two human experts manually summarized all the datasets to provide a strong basis for comparison and an effective evaluation of the suggested method. The evaluation phase uses several performance measures, including accuracy, precision, recall, F-measure, and ROUGE. The experimental results indicate that the suggested method performs competitively with the human experts in summarization ratio as well as in the accuracy of the produced document.
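The general shape of such an extractive pipeline can be illustrated with a minimal Python sketch. It is not the chapter's implementation. It makes three assumptions: "text decoration" is taken to mean Arabic diacritics and tatweel, sentences stand in for clauses, and a plain term-frequency score stands in for the chapter's feature vectors. Synonym unification and stemming are omitted. All function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: "text decoration" = Arabic diacritics (U+064B..U+0652) and tatweel (U+0640).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
PUNCT = re.compile(r"[^\w\s]")  # anything that is neither a word character nor whitespace


def normalize(text):
    """Pre-processing: strip decoration and punctuation, then tokenize on whitespace."""
    text = DIACRITICS.sub("", text)
    text = PUNCT.sub(" ", text)
    return text.split()


def split_sentences(text):
    """Split on Latin and Arabic sentence-ending marks (sentences stand in for clauses)."""
    parts = re.split(r"[.!?\u061F]+", text)
    return [p.strip() for p in parts if p.strip()]


def score_sentences(sentences):
    """Score each sentence by the mean document frequency of its tokens (illustrative only)."""
    tokens = [normalize(s) for s in sentences]
    freq = Counter(t for sent in tokens for t in sent)
    return [sum(freq[t] for t in sent) / len(sent) if sent else 0.0 for sent in tokens]


def summarize(text, k=1):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = score_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def precision_recall_f1(selected, reference):
    """Compare selected sentence indices against a human expert's selection."""
    selected, reference = set(selected), set(reference)
    tp = len(selected & reference)
    p = tp / len(selected) if selected else 0.0
    r = tp / len(reference) if reference else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

Calling `summarize(document, k)` returns the `k` top-scoring sentences, and `precision_recall_f1` compares a set of selected sentence indices with a reference selection such as one made by a human expert. ROUGE and accuracy, which the abstract also names, are not covered by this sketch.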

Corresponding author

Correspondence to Ahmed A. Ewees.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Bialy, A.A., Gaheen, M.A., ElEraky, R.M., ElGamal, A.F., Ewees, A.A. (2020). Single Arabic Document Summarization Using Natural Language Processing Technique. In: Abd Elaziz, M., Al-qaness, M., Ewees, A., Dahou, A. (eds) Recent Advances in NLP: The Case of Arabic Language. Studies in Computational Intelligence, vol 874. Springer, Cham. https://doi.org/10.1007/978-3-030-34614-0_2
