Abstract
Features in a text are hierarchically structured and may not be learned optimally with one-step encoding. Reading a piece of text several times facilitates a better understanding of its content and helps build faithful context representations. The proposed model encapsulates this idea of re-examining a text multiple times to grasp its underlying theme and grammatical structure before formulating a summary. We propose a multi-level shared-weight encoder (MSE) that focuses exclusively on the sentence summarization task. MSE uses a weight-sharing mechanism to regulate the multi-level encoding process efficiently; weight sharing helps recognize patterns left undiscovered by a single-level encoding strategy. We run experiments with six weight-shared encoding levels on the widely used Gigaword and DUC2004 Task 1 sentence summarization datasets. The results show that MSE generates more readable (fluent) summaries (Rouge-L score) than multiple benchmark models while preserving a similar level of informativeness (Rouge-1 and Rouge-2 scores). Human evaluation of the generated summaries further corroborates this enhanced readability.
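As a rough illustration of the weight-sharing idea the abstract describes, the PyTorch sketch below re-applies a single BiLSTM to its own projected outputs for six passes, so every encoding level shares one set of weights. The BiLSTM choice, the projection layer, the layer sizes, and all names here are assumptions made for illustration only; they are not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class MultiLevelSharedEncoder(nn.Module):
    """Minimal sketch of multi-level encoding with shared weights:
    one BiLSTM is re-applied to its own (projected) outputs for
    `levels` passes, so every encoding level shares the same weights."""

    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, levels=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # A single BiLSTM reused at every level (the weight sharing).
        self.encoder = nn.LSTM(emb_dim, hid_dim,
                               bidirectional=True, batch_first=True)
        # Fold the 2*hid_dim BiLSTM output back to emb_dim so the same
        # LSTM can re-read its own output at the next level.
        self.project = nn.Linear(2 * hid_dim, emb_dim)
        self.levels = levels

    def forward(self, token_ids):
        x = self.embed(token_ids)              # (batch, seq, emb_dim)
        for _ in range(self.levels):           # "re-examine" the sentence
            out, _ = self.encoder(x)           # (batch, seq, 2*hid_dim)
            x = torch.tanh(self.project(out))  # fold back for the next pass
        return x                               # multi-level representation

enc = MultiLevelSharedEncoder(vocab_size=30000)
reps = enc(torch.randint(0, 30000, (4, 20)))   # -> shape (4, 20, 128)
```

Projecting back to the embedding dimension is only one way to let the shared encoder consume its own output; a residual connection or a dimension-matched recurrent cell would serve the same purpose, with a decoder attending over the returned representations.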
Ethics declarations
Conflicts of interest
All authors declare that they have no conflict of interest.
Cite this article
Lal, D.M., Singh, K.P. & Tiwary, U.S. Multi-level shared-weight encoding for abstractive sentence summarization. Neural Comput & Applic 34, 2965–2981 (2022). https://doi.org/10.1007/s00521-021-06566-7