A Comparative Study of Classifiers for Extractive Text Summarization

  • Anshuman Pattanaik
  • Sanjeevani Subhadra Mishra
  • Madhabananda Das
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1101)


Automatic text summarization (ATS) is a widely used approach. Over the years, various techniques have been implemented to produce summaries. Extractive summarization is a traditional mechanism for information extraction in which the important sentences, those that reflect the basic concepts of the article, are selected. In this paper, extractive summarization is treated as a classification problem. Machine learning techniques have been applied to classification problems in various domains. To solve the summarization problem, this paper takes a machine learning approach: k-nearest neighbors (KNN), random forest, support vector machine, multilayer perceptron, decision tree and logistic regression algorithms are implemented on the Newsroom dataset.
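To make the classification framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes three illustrative sentence features (relative position, relative length, mean term frequency) that the abstract does not specify, and it uses a small hand-written k-nearest-neighbors classifier so that the example needs only the Python standard library. The function names `sentence_features`, `knn_predict` and `summarize` are hypothetical, and the training vectors in the usage note are toy values rather than Newsroom data.

```python
import math
import re
from collections import Counter


def sentence_features(sentences):
    """Return [position, length, frequency] for each sentence (assumed features)."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    term_freq = Counter(w for tokens in tokenized for w in tokens)
    max_len = max((len(t) for t in tokenized), default=1) or 1
    max_tf = max(term_freq.values(), default=1)
    n = len(sentences)
    features = []
    for i, tokens in enumerate(tokenized):
        position = 1.0 - i / n            # earlier sentences score higher
        length = len(tokens) / max_len    # length relative to the longest sentence
        if tokens:
            # mean term frequency of the sentence's words, scaled to [0, 1]
            freq = sum(term_freq[w] for w in tokens) / (len(tokens) * max_tf)
        else:
            freq = 0.0
        features.append([position, length, freq])
    return features


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training vectors closest to x (Euclidean)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def summarize(sentences, train_x, train_y, k=3):
    """Keep the sentences that the classifier labels 1 (in summary)."""
    features = sentence_features(sentences)
    return [s for s, f in zip(sentences, features)
            if knn_predict(train_x, train_y, f, k) == 1]
```

A usage sketch with toy training data: given `train_x = [[1.0, 1.0, 1.0], [0.9, 0.9, 0.9], [0.8, 0.9, 1.0], [0.1, 0.1, 0.1], [0.2, 0.1, 0.0], [0.0, 0.2, 0.1]]` and `train_y = [1, 1, 1, 0, 0, 0]`, calling `summarize(sentences, train_x, train_y)` returns the sentences whose feature vectors fall nearest the positive examples. Any of the other classifiers named above could be substituted for `knn_predict`, since each maps a feature vector to a binary label.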


Keywords: Text summarization, Extractive, Sentence scoring, Machine learning



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Anshuman Pattanaik (1)
  • Sanjeevani Subhadra Mishra (1)
  • Madhabananda Das (1)
  1. School of Computer Engineering, Kalinga Institute of Industrial Technology (Deemed-to-be University), Bhubaneswar, India
