Abstract
Matrix factorization is an important tool for text summarization. In recent years, several variants of non-negative matrix factorization (NMF) have been applied to multi-document summarization (MDS). For matrix factorization to work well in MDS, it must be able to select the most typical data points from the given data space. In this chapter, we first describe archetypal analysis (AA) and its weighted version, and then present an AA-based document summarization method for the two best-known summarization tasks: general and query-focused MDS. Archetypal analysis, also known as convex NMF, differs from other NMF methods in that it selects distinct (archetypal) sentences, which leads to variability and diversity in the content of the generated summaries. We conducted experiments on data from the Document Understanding Conference (DUC). The experimental results show that the proposed approach improves on other closely related methods, including those based on NMF.
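To make the idea of selecting archetypal sentences concrete, the following is a minimal sketch of plain (unweighted) archetypal analysis. It is not the chapter's implementation. It assumes a toy sentence-by-term count matrix `X` (made up here), the standard objective of minimizing ||X - S C X||_F^2 with the rows of `S` and `C` constrained to the probability simplex, and a simple alternating projected-gradient solver. The function names, the solver, and the rule of picking the highest-weight sentence per archetype are all illustrative assumptions.

```python
import numpy as np


def project_rows_to_simplex(V):
    """Euclidean projection of each row of V onto the probability simplex."""
    n, k = V.shape
    U = np.sort(V, axis=1)[:, ::-1]            # each row sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0
    idx = np.arange(1, k + 1)
    rho = (U - css / idx > 0).sum(axis=1)      # size of the support, always >= 1
    theta = css[np.arange(n), rho - 1] / rho
    return np.maximum(V - theta[:, None], 0.0)


def archetypal_analysis(X, k, n_iter=200, seed=0):
    """Approximate X by S @ C @ X, with the rows of S and C on the simplex.

    The k archetypes are the rows of C @ X, i.e. convex combinations of data
    points. Alternating projected gradient is an assumed solver, not
    necessarily the one used in the chapter.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    C = project_rows_to_simplex(rng.random((k, n)))
    S = project_rows_to_simplex(rng.random((n, k)))
    losses = []
    for _ in range(n_iter):
        # Update S with C fixed; the step size is 1 / Lipschitz constant.
        Z = C @ X
        R = S @ Z - X
        L_S = np.linalg.norm(Z @ Z.T, 2) + 1e-12
        S = project_rows_to_simplex(S - (R @ Z.T) / L_S)
        # Update C with S fixed.
        R = S @ C @ X - X
        L_C = np.linalg.norm(S.T @ S, 2) * np.linalg.norm(X @ X.T, 2) + 1e-12
        C = project_rows_to_simplex(C - (S.T @ R @ X.T) / L_C)
        losses.append(0.5 * np.linalg.norm(X - S @ C @ X) ** 2)
    return S, C, losses


# Hypothetical toy input: 6 sentences (rows) described by 4 term counts (columns).
X = np.array([
    [3.0, 0.0, 0.0, 1.0],
    [2.0, 1.0, 0.0, 0.0],
    [0.0, 3.0, 1.0, 0.0],
    [0.0, 2.0, 2.0, 0.0],
    [1.0, 0.0, 0.0, 3.0],
    [0.0, 0.0, 1.0, 2.0],
])

S, C, losses = archetypal_analysis(X, k=2)

# Simplified selection rule (an assumption): for each archetype, take the
# sentence that contributes most to it.
picked = [int(np.argmax(C[j])) for j in range(C.shape[0])]
print("sentence index per archetype:", picked)
```

Because each archetype is a convex combination of actual sentences, the weights in `C` point back to concrete sentences that can be extracted, which is the property the abstract contrasts with other NMF methods.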
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Canhasi, E., Kononenko, I. (2016). Automatic Extractive Multi-document Summarization Based on Archetypal Analysis. In: Naik, G. (eds) Non-negative Matrix Factorization Techniques. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48331-2_3
DOI: https://doi.org/10.1007/978-3-662-48331-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-48330-5
Online ISBN: 978-3-662-48331-2
eBook Packages: Engineering, Engineering (R0)