
Multi-level shared-weight encoding for abstractive sentence summarization

  • Original Article
  • Neural Computing and Applications

Abstract

Features in a text are hierarchically structured and may not be learned optimally with one-step encoding. Re-reading a piece of text several times facilitates a better understanding of its content and helps build faithful context representations. The proposed model encapsulates this idea of re-examining a text multiple times to grasp the underlying theme and aspects of English grammar before formulating a summary. We propose a multi-level shared-weight encoder (MSE) that focuses exclusively on the sentence summarization task. MSE uses a weight-sharing mechanism to regulate the multi-level encoding process; weight sharing helps recognize patterns left undiscovered by a single-level encoding strategy. We perform experiments with six encoding levels and weight sharing on the widely used Gigaword and DUC2004 Task 1 sentence summarization datasets. The experiments show that MSE generates more readable (fluent) summaries (ROUGE-L score) than multiple benchmark models while preserving similar levels of informativeness (ROUGE-1 and ROUGE-2 scores). Moreover, human evaluation of the generated abstracts corroborates these findings of enhanced readability.
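To make the multi-level idea concrete, below is a minimal sketch of an encoder whose weights are shared across encoding levels: a single recurrent encoder is applied repeatedly, each level re-reading the previous level's outputs. This is illustrative only; the bidirectional GRU, the projection that feeds encoder outputs back into the encoder, and names such as MultiLevelSharedEncoder and num_levels are assumptions for the sketch, not the authors' exact architecture or hyperparameters.

```python
# Minimal sketch of multi-level encoding with a shared-weight encoder
# (illustrative only -- not the paper's released implementation).
import torch
import torch.nn as nn


class MultiLevelSharedEncoder(nn.Module):
    """Re-encodes a sentence `num_levels` times with one shared GRU."""

    def __init__(self, vocab_size, emb_dim=256, hidden_dim=256, num_levels=6):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # A single GRU instance: its weights are reused at every level.
        self.encoder = nn.GRU(emb_dim, hidden_dim,
                              bidirectional=True, batch_first=True)
        # Project the 2*hidden_dim bidirectional outputs back to emb_dim so
        # the same GRU can re-read its own output at the next level.
        self.reproject = nn.Linear(2 * hidden_dim, emb_dim)
        self.num_levels = num_levels

    def forward(self, src_ids):
        x = self.embedding(src_ids)             # (batch, seq_len, emb_dim)
        for _ in range(self.num_levels):
            outputs, hidden = self.encoder(x)   # same weights at each pass
            x = torch.tanh(self.reproject(outputs))
        return outputs, hidden                  # handed to an attentive decoder


if __name__ == "__main__":
    enc = MultiLevelSharedEncoder(vocab_size=10_000)
    batch = torch.randint(0, 10_000, (2, 20))   # two toy source sentences
    outs, h = enc(batch)
    print(outs.shape)                           # torch.Size([2, 20, 512])
```

Sharing one encoder across levels keeps the parameter count close to that of a single-level model while letting later passes refine the representations produced by earlier ones, which matches the paper's intuition of re-reading a sentence before summarizing it.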



Author information

Corresponding author

Correspondence to Daisy Monika Lal.

Ethics declarations

Conflicts of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Lal, D.M., Singh, K.P. & Tiwary, U.S. Multi-level shared-weight encoding for abstractive sentence summarization. Neural Comput & Applic 34, 2965–2981 (2022). https://doi.org/10.1007/s00521-021-06566-7
