Extractive Text Summarization Using Topic Modelling and Entropy

  • Conference paper
Proceedings of the 2nd International Conference on Cognitive and Intelligent Computing (ICCIC 2022)

Abstract

In Natural Language Processing, text summarization is a prominent application that extracts information from one or more documents and condenses it into a summary. Extractive summarization has been studied extensively. In the proposed model, topics are generated with a topic modelling technique such as LDA (Latent Dirichlet Allocation) or HDP (Hierarchical Dirichlet Process), and term probabilities are obtained for each topic once the optimal number of topics has been selected. The probability of each sentence is then computed from the per-topic probabilities of its terms, and entropies are calculated for both topics and sentences. Each sentence is scored by combining its entropy in topic space, taken for the topic with the highest entropy in term space, with its similarity score against the other sentences. Sentences whose score exceeds a threshold are selected, duplicates are removed, and the remaining sentences are joined to form the summary, which is evaluated against the target summary using ROUGE metrics. On the WikiHow dataset, the LDA model achieved an accuracy of 75% in comparison with state-of-the-art methods.
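The scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is one possible reading of the abstract, not the authors' implementation. The topic-term probabilities in `TOPIC_TERM` are hand-written toy values standing in for the output of a fitted LDA or HDP model, and the specific choices made here (mean term probability as the sentence probability, `-p log2 p` as the sentence entropy term, mean Jaccard overlap as the similarity score, a simple sum as the final score, and ROUGE-1 recall as the evaluation metric) are all assumptions, as are the function names.

```python
import math
import re
from collections import Counter

# Toy topic-term probabilities (assumed values; a real run would take
# these from a fitted LDA or HDP model).
TOPIC_TERM = [
    {"topic": 0.4, "model": 0.3, "entropy": 0.2, "summary": 0.1},
    {"cat": 0.9, "dog": 0.1},
]
SENTENCES = [
    "Topic model entropy guides the summary.",
    "Topic model entropy guides the summary.",  # deliberate duplicate
    "The cat sat.",
]


def tokenize(text):
    """Lowercase a string and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def entropy(probs):
    """Shannon entropy (base 2) of a distribution, normalised first."""
    total = sum(probs)
    if total <= 0:
        return 0.0
    return -sum((p / total) * math.log2(p / total) for p in probs if p > 0)


def top_entropy_topic(topic_term):
    """Index of the topic with the highest entropy in term space."""
    return max(range(len(topic_term)),
               key=lambda k: entropy(topic_term[k].values()))


def sentence_prob(tokens, term_probs):
    """Assumed sentence probability: mean term probability under a topic."""
    if not tokens:
        return 0.0
    return sum(term_probs.get(t, 0.0) for t in tokens) / len(tokens)


def jaccard(a, b):
    """Jaccard overlap between two token lists (assumed similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize(sentences, topic_term, threshold):
    """Keep sentences scoring above threshold, drop duplicates, join."""
    tokenized = [tokenize(s) for s in sentences]
    k = top_entropy_topic(topic_term)
    selected, seen = [], set()
    for i, tokens in enumerate(tokenized):
        p = sentence_prob(tokens, topic_term[k])
        sent_entropy = -p * math.log2(p) if p > 0 else 0.0
        others = [jaccard(tokens, t) for j, t in enumerate(tokenized) if j != i]
        similarity = sum(others) / len(others) if others else 0.0
        key = " ".join(tokens)
        if sent_entropy + similarity > threshold and key not in seen:
            seen.add(key)
            selected.append(sentences[i])
    return " ".join(selected)


def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: shared unigrams divided by reference unigrams."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum((cand & ref).values())
    return overlap / sum(ref.values()) if ref else 0.0


if __name__ == "__main__":
    summary = summarize(SENTENCES, TOPIC_TERM, threshold=0.5)
    print(summary)
    print(rouge1_recall(summary, "Entropy over a topic model guides the summary."))
```

In this toy run the first topic has the higher term-space entropy, so sentences are scored against it. The threshold of 0.5 is an arbitrary illustrative value; the abstract does not state how the threshold is chosen.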

Author information

Corresponding author

Correspondence to T. Satyanarayana Murthy.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Vakkalagaddda, S., Satyanarayana Murthy, T. (2023). Extractive Text Summarization Using Topic Modelling and Entropy. In: Kumar, A., Ghinea, G., Merugu, S. (eds) Proceedings of the 2nd International Conference on Cognitive and Intelligent Computing. ICCIC 2022. Cognitive Science and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-99-2746-3_35
