An improved method of automatic text summarization for web contents using lexical chain with semantic-related terms
Research on automatic text summarization has grown as the number of text documents keeps increasing with the constant spread of information. The objective of this work is to obtain the most reliable and substantial content, or the most relevant brief summary, of a text in an extractive manner. Extractive text summarization produces a short summary that contains the most important information of the original text by extracting a set of sentences from the original document. This paper proposes an improved extractive text summarization method for documents that enhances the conventional lexical chain method to produce more relevant information from the text, using three distinct features, or characteristics, of keywords in a text. The keywords of the document are labeled using our previous work, the transition probability distribution generator model, which learns the characteristics of the keywords in a document and generates their probability distribution over each feature.
Keywords: Automatic text summarization, Keyword extraction, Lexical chain, Markov chain, WordNet, Semantic-related terms, Web contents, Machine learning
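As a rough illustration of the extractive setting described in the abstract, the sketch below selects whole sentences from a document by scoring them with keyword weights. It is not the method proposed in the paper: plain term frequency stands in for the lexical chain and the transition probability distribution generator, and the names `STOPWORDS`, `split_sentences`, `tokenize`, and `summarize` are hypothetical helpers introduced only for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use a fuller list).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are",
             "it", "on", "for", "by", "with", "that", "this"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in document order.

    A sentence's score is the mean document frequency of its tokens.
    This frequency weight is a stand-in assumption, not the keyword
    features used in the paper.
    """
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        return sum(freq[w] for w in tokens) / len(tokens)

    # Rank sentence indices by score, then restore original order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    doc = ("Cats chase mice. Cats sleep a lot. "
           "The weather was mild yesterday.")
    for line in summarize(doc, num_sentences=2):
        print(line)
```

Because the output consists only of sentences copied from the input, the summary stays extractive; a lexical chain approach would replace the frequency score with one derived from chains of semantically related terms.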
This study was supported by a research fund from Chosun University, 2015.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.