Extraction Based Automatic Text Summarization System with HMM Tagger

  • Suneetha Manne
  • Zaheer Parvez Shaik Mohd.
  • S. Sameen Fatima
Conference paper
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 132)


A rough estimate by the search engine Google in 2010 put the total size of the internet at 2 petabytes. Improved performance and faster access to web resources have created a new challenge: browsing through huge volumes of data on the internet. Browsing the web is therefore an underlying topic for researchers. Web research has shifted toward Browsing among Information (BAI) rather than Browsing for Information (BFI). The field of Information Extraction (IE) offers considerable scope for condensing information, enabling the user to decide by merely glancing at the snippet of each link. Automatic text summarization is the process of condensing a source text into a shorter version while preserving its information content and overall meaning. In this paper, we propose a frequent-term-based text summarization technique, built on the analysis of parts of speech, for generating effective and efficient summaries.
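As a rough illustration of the frequent-term idea named in the abstract, the sketch below scores each sentence by the average frequency of its non-stop-word terms and keeps the top-scoring sentences in their original order. It is not the authors' system: the abstract gives no implementation details, the part-of-speech analysis with an HMM tagger is omitted entirely, and the stop-word list, the function names (`split_sentences`, `tokenize`, `summarize`) and the scoring rule are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and",
              "to", "it", "for", "by", "with", "that", "this"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOP_WORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Term frequencies over the whole document.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average term frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score (ties keep document order), then restore order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]
```

A part-of-speech filter, if one were available, could plausibly be applied inside `tokenize` so that only selected word classes (for example nouns) contribute to the frequency counts; that step is left out here because it depends on a trained tagger.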


Hidden Markov Model, Sentence Length, Feature Term, Text Summarization, Stop Word Removal
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Suneetha Manne (1)
  • Zaheer Parvez Shaik Mohd. (1)
  • S. Sameen Fatima (2)
  1. Department of IT, VRSEC, Vijayawada, India
  2. Department of CSE, Osmania University, Hyderabad, India
