An Automatic Text Summarization on Naive Bayes Classifier Using Latent Semantic Analysis

Data, Engineering and Applications

Abstract

A huge amount of textual information is available on the Internet, but finding the relevant information quickly and efficiently is difficult. A competent system is required to retrieve the most appropriate information from such a corpus. Automatic text summarization converts a large document into a shorter, precise version: it selects the significant parts of the text and builds a comprehensive summary that represents the main content of the given document. Extractive summarization scores and ranks the sentences of the document and extracts the highest-ranked ones. In this paper, the model that we have developed uses the latent semantic analysis (LSA) technique and chooses sentences based on a specific threshold given by the system. Further, using the Naïve Bayes approach of machine learning, the model trains a classifier and predicts the summary, which is built on the basis of the singular-value decomposition (SVD) computation. Before training the model, it selects two important concepts of SVD: feature ranking and recursive feature elimination. This paper focuses on extractive text summarization using machine learning, statistical techniques, and latent semantic analysis.
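
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature set, the threshold, or the classifier variant, so the code assumes raw term counts, a threshold equal to the mean sentence score, and a small Gaussian Naïve Bayes written from scratch. All function and class names (`split_sentences`, `lsa_features`, `TinyGaussianNB`, `summarize`) are hypothetical, and the feature ranking and recursive feature elimination steps are left out.

```python
import re
import numpy as np


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def term_sentence_matrix(sentences):
    """Rows are terms, columns are sentences, cells are raw term counts."""
    tokens = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({w for toks in tokens for w in toks})
    row = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokens):
        for w in toks:
            A[row[w], j] += 1.0
    return A


def lsa_features(A, k=2):
    """Project each sentence onto the top-k latent concepts of A = U S Vt."""
    _, S, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(S))
    return (S[:k, None] * Vt[:k, :]).T  # shape: (n_sentences, k)


def lsa_scores(features):
    """Score = length of the sentence vector in the weighted concept space."""
    return np.linalg.norm(features, axis=1)


class TinyGaussianNB:
    """Hypothetical minimal Gaussian Naive Bayes (assumed variant)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small constant keeps the variance positive for one-sample classes.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-6
        self.log_prior_ = np.log([(y == c).mean() for c in self.classes_])
        return self

    def predict(self, X):
        # log P(c) + sum_j log N(x_j | mean_cj, var_cj), maximised over c
        diff = X[:, None, :] - self.mean_[None]
        log_lik = -0.5 * (np.log(2 * np.pi * self.var_)[None]
                          + diff ** 2 / self.var_[None]).sum(axis=2)
        return self.classes_[np.argmax(self.log_prior_[None] + log_lik, axis=1)]


def summarize(text, k=2, threshold=None):
    """Label sentences by LSA score, train the classifier, return its picks."""
    sentences = split_sentences(text)
    feats = lsa_features(term_sentence_matrix(sentences), k)
    scores = lsa_scores(feats)
    if threshold is None:
        threshold = scores.mean()  # assumption: the source gives no value
    labels = (scores >= threshold).astype(int)
    predicted = TinyGaussianNB().fit(feats, labels).predict(feats)
    summary = [s for s, keep in zip(sentences, predicted) if keep == 1]
    return summary, labels, predicted
```

Calling `summarize(document_text)` returns the predicted summary sentences together with the threshold-derived labels and the classifier's predictions. Training and predicting on the same sentences only demonstrates the data flow; a realistic setup would train on labelled documents and predict on unseen ones.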

Author information

Corresponding author

Correspondence to Chintan Shah.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Shah, C., Jivani, A. (2019). An Automatic Text Summarization on Naive Bayes Classifier Using Latent Semantic Analysis. In: Shukla, R.K., Agrawal, J., Sharma, S., Singh Tomer, G. (eds) Data, Engineering and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-13-6347-4_16

  • DOI: https://doi.org/10.1007/978-981-13-6347-4_16

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6346-7

  • Online ISBN: 978-981-13-6347-4

  • eBook Packages: Computer Science, Computer Science (R0)
