Abstract
Efficient access to information in text documents with a high degree of semantic content has become more difficult owing to the diversity of vocabulary and the rapid growth of the Internet. Traditional text clustering algorithms are widely used to organize a large text document into smaller, manageable groups of sentences, but they do not consider the semantic relationships among the words in the document. Lexical chains identify cohesion links between words through their semantic relationships: they link words in a document that are thought to describe the same concept. Summarization based on lexical chains therefore takes into account linguistic features of the document that are otherwise ignored in statistical summarization approaches. In this paper, we propose a text summarization technique that constructs lexical chains and defines a coherence metric to select the summary sentences.
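To make the idea of chain-based sentence selection concrete, the following is a minimal sketch and not the method of the paper. It assumes a tiny hand-written concept table (`TOY_CONCEPTS`, hypothetical) in place of a real lexical resource such as WordNet, and it assumes a simple score in which each sentence earns the length of every chain it contributes to. The abstract does not specify the coherence metric, so this score is only a stand-in, and all function names are illustrative.

```python
import re
from collections import defaultdict

# ASSUMPTION: a toy word-to-concept table standing in for a real lexical
# resource such as WordNet. Words that share a concept label are treated
# as semantically related and fall into the same lexical chain.
TOY_CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "engine": "machine", "motor": "machine",
    "road": "route", "highway": "route",
}


def split_sentences(text):
    """Split text into sentences at whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_chains(sentences):
    """Group related words into chains: concept -> [(sentence_index, word)]."""
    chains = defaultdict(list)
    for i, sentence in enumerate(sentences):
        for word in re.findall(r"[a-z]+", sentence.lower()):
            concept = TOY_CONCEPTS.get(word)
            if concept is not None:
                chains[concept].append((i, word))
    return dict(chains)


def score_sentences(sentences, chains):
    """ASSUMED score: a sentence earns the length of each chain it touches."""
    scores = [0] * len(sentences)
    for members in chains.values():
        strength = len(members)
        for idx in {i for i, _ in members}:
            scores[idx] += strength
    return scores


def summarize(text, k=1):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    chains = build_chains(sentences)
    scores = score_sentences(sentences, chains)
    # Ties are broken in favour of the earlier sentence.
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(text, k=2)` on a short passage returns the two sentences that take part in the strongest chains. In a fuller implementation the toy table would be replaced by relations drawn from a thesaurus (synonymy, hypernymy and so on), and the score by whatever coherence metric is adopted.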
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mallick, C., Dutta, M., Das, A.K., Sarkar, A., Das, A.K. (2019). Extractive Summarization of a Document Using Lexical Chains. In: Nayak, J., Abraham, A., Krishna, B., Chandra Sekhar, G., Das, A. (eds) Soft Computing in Data Analytics. Advances in Intelligent Systems and Computing, vol 758. Springer, Singapore. https://doi.org/10.1007/978-981-13-0514-6_78
DOI: https://doi.org/10.1007/978-981-13-0514-6_78
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0513-9
Online ISBN: 978-981-13-0514-6
eBook Packages: Intelligent Technologies and Robotics (R0)