Extraction Based Automatic Text Summarization System with HMM Tagger
A rough estimate by the search engine Google in 2010 put the total size of the internet at about 2 petabytes. Improved performance and faster access to web resources have created a new challenge: browsing through huge volumes of data on the internet. Browsing on the web is hence a topic of interest for researchers. Web research has shifted toward Browsing among Information (BAI) rather than Browsing for Information (BFI). The field of Information Extraction (IE) offers considerable scope for condensing information so that the user can decide merely by checking the snippet of each link. Automatic text summarization is the process of condensing the source text into a shorter version while preserving its information content and overall meaning. In this paper, we propose a frequent-term-based text summarization technique, based on the analysis of parts of speech, for generating effective and efficient summaries.
Keywords: Hidden Markov Model, Sentence Length, Feature Term, Text Summarization, Stop Word Removal
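The general idea of frequent-term-based extractive summarization named in the abstract can be illustrated with a minimal sketch. This is not the paper's system: it omits the HMM part-of-speech tagging step entirely, the stop word list is a small illustrative set, and the scoring rule (average term frequency per content word, which normalizes for sentence length) is an assumption made for the example. The names split_sentences, content_words and summarize are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop word list (an assumption, not the paper's list).
STOP_WORDS = frozenset({
    "a", "an", "the", "is", "are", "was", "were", "of", "in", "on",
    "to", "and", "or", "it", "this", "that", "for", "with", "as", "by", "be",
})


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def content_words(sentence):
    """Lowercase alphabetic tokens with stop words removed."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences in their original order.

    A sentence's score is the mean document-wide frequency of its content
    words (assumed scoring rule; dividing by the word count keeps long
    sentences from winning on length alone).
    """
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        if not words:
            return 0.0
        return sum(freq[w] for w in words) / len(words)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in chosen]
```

A part-of-speech tagger could be placed before the frequency count, for example to keep only nouns as candidate feature terms; that filtering choice is likewise an assumption and is not taken from the abstract.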