Abstract
Extracting the desired information from internet sources is challenging because of the large amount of information available online. We therefore propose a new extractive approach to multi-document text summarization that extracts useful information from multiple documents. Initially, the multiple text documents, which contain redundant content, are merged into a single text file. Content coverage and non-redundancy are achieved with a hybrid weight method (WM) that combines two similarity measures, Word Mover's Distance (WMD) and Modified Normalized Google Distance (M-NGD). Feature weights are optimized with dolphin swarm optimization (DSO), a metaheuristic approach. The proposed approach is implemented in Python, tested on the MultiLing 2013 dataset, and evaluated with the ROUGE and AutoSummENG metrics. The experimental results show that the proposed technique is effective for multi-document text summarization.
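As a rough illustration of the kind of co-occurrence-based similarity the abstract refers to, the following Python sketch computes the standard Normalized Google Distance between two terms. It is not the M-NGD variant used in the paper, whose modification is not described here. As an assumption made only for this sketch, frequencies are counted over tokenized sentences of the merged document rather than web hits, and the function name `ngd` and the toy sentences are hypothetical.

```python
import math


def ngd(term_x, term_y, sentences):
    """Standard Normalized Google Distance between two terms.

    Assumption for this sketch: f(x) is the number of sentences that
    contain x, f(x, y) the number that contain both terms, and N the
    total number of sentences. This is NOT the paper's M-NGD.
    """
    n = len(sentences)
    fx = sum(1 for s in sentences if term_x in s)
    fy = sum(1 for s in sentences if term_y in s)
    fxy = sum(1 for s in sentences if term_x in s and term_y in s)
    if fx == 0 or fy == 0 or fxy == 0:
        # The terms never co-occur, so treat them as maximally distant.
        return float("inf")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n) - min(log_fx, log_fy)
    if denom == 0:
        # Both terms occur in every sentence, so they always co-occur.
        return 0.0
    return (max(log_fx, log_fy) - log_fxy) / denom


# Toy corpus: each sentence is represented as a set of tokens.
sentences = [
    {"dolphin", "swarm"},
    {"dolphin", "swarm", "search"},
    {"text", "summary"},
    {"text", "search"},
]
always_together = ngd("dolphin", "swarm", sentences)
sometimes_together = ngd("dolphin", "search", sentences)
never_together = ngd("dolphin", "text", sentences)
```

Terms that always appear together receive a distance of zero, and the distance grows as co-occurrence becomes rarer. A sentence-level similarity could then be built by aggregating such term distances, although the aggregation used in the paper is not specified in this abstract.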
Author information
Contributions
All the authors have participated in writing the manuscript and have revised the final version. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants and/or animals performed by any of the authors.
Informed consent
There is no informed consent for this study.
About this article
Cite this article
Srivastava, A.K., Pandey, D. & Agarwal, A. Extractive multi-document text summarization using dolphin swarm optimization approach. Multimed Tools Appl 80, 11273–11290 (2021). https://doi.org/10.1007/s11042-020-10176-1
DOI: https://doi.org/10.1007/s11042-020-10176-1