Abstract
Multi-document summarization is used to extract the most relevant sentences from a set of documents, allowing users to grasp their content more quickly. This paper treats the generation of extractive summaries from multiple documents as a binary optimization problem and proposes a method called MA-MultiSumm, based on the CHC evolutionary algorithm and greedy search, in which the objective function optimizes a linear combination of coverage and redundancy factors. MA-MultiSumm was compared with other state-of-the-art methods using ROUGE measures. The results show that MA-MultiSumm outperforms all of the compared methods on the DUC2005 dataset, and that on DUC2006 its results are very close to those of the best method. Furthermore, in a unified ranking MA-MultiSumm was outperformed only by the DESAMC+DocSum method, which requires as many iterations of the evolutionary process as MA-MultiSumm. The experimental results indicate that the optimization-based approach to multi-document summarization is a promising research direction.
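To make the binary formulation concrete, the following is a minimal sketch, not the authors' implementation. It encodes a summary as a 0/1 vector over sentences and scores it with a linear combination of a coverage term and a redundancy term. The abstract does not give the exact formula, so the term-frequency vectors, the cosine similarity, the centroid-based coverage, the weight `lam`, the word budget `max_words`, and every function name here are assumptions chosen for illustration. Only a simple greedy search is shown; the CHC evolutionary component is not reproduced.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (whitespace tokenisation, assumed)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def objective(selection, vectors, centroid, lam=0.5):
    """Assumed objective: lam * coverage - (1 - lam) * redundancy.

    selection is the binary vector; selection[i] == 1 keeps sentence i.
    Coverage sums each chosen sentence's similarity to the collection
    centroid; redundancy sums pairwise similarity among chosen sentences.
    """
    chosen = [v for bit, v in zip(selection, vectors) if bit]
    coverage = sum(cosine(v, centroid) for v in chosen)
    redundancy = sum(
        cosine(chosen[i], chosen[j])
        for i in range(len(chosen))
        for j in range(i + 1, len(chosen))
    )
    return lam * coverage - (1 - lam) * redundancy


def greedy_select(sentences, max_words, lam=0.5):
    """Greedily set bits to 1 while the objective improves and the budget holds."""
    vectors = [tf_vector(s) for s in sentences]
    centroid = Counter()
    for v in vectors:
        centroid.update(v)
    selection = [0] * len(sentences)
    used = 0
    while True:
        base = objective(selection, vectors, centroid, lam)
        best_gain, best_i = 0.0, None
        for i, sentence in enumerate(sentences):
            length = len(sentence.split())
            if selection[i] or used + length > max_words:
                continue
            selection[i] = 1  # tentatively add sentence i
            gain = objective(selection, vectors, centroid, lam) - base
            selection[i] = 0  # undo the tentative change
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:
            return selection
        selection[best_i] = 1
        used += len(sentences[best_i].split())


docs = [
    "the cat sat on the mat",
    "the cat sat on the mat",
    "dogs bark loudly at night",
]
summary_bits = greedy_select(docs, max_words=12)
```

In this toy input the duplicated sentence is skipped because adding it raises redundancy by more than it raises coverage, which is the trade-off the linear combination is meant to capture. A memetic method in the spirit of the abstract would apply a local search of this kind inside an evolutionary loop rather than on its own.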
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Mendoza, M., Cobos, C., León, E., Lozano, M., Rodríguez, F., Herrera-Viedma, E. (2014). A New Memetic Algorithm for Multi-document Summarization Based on CHC Algorithm and Greedy Search. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Human-Inspired Computing and Its Applications. MICAI 2014. Lecture Notes in Computer Science, vol 8856. Springer, Cham. https://doi.org/10.1007/978-3-319-13647-9_14
DOI: https://doi.org/10.1007/978-3-319-13647-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13646-2
Online ISBN: 978-3-319-13647-9
eBook Packages: Computer Science, Computer Science (R0)