
Multi-level shared-weight encoding for abstractive sentence summarization

  • Original Article
  • Neural Computing and Applications

Abstract

Features in a text are hierarchically structured and may not be learned optimally with one-step encoding. Re-reading a piece of text several times facilitates a better understanding of its content and helps build faithful context representations. The proposed model encapsulates this idea of re-examining a text multiple times to grasp the underlying theme and aspects of English grammar before formulating a summary. We propose a multi-level shared-weight encoder (MSE) that focuses exclusively on the sentence summarization task. MSE uses a weight-sharing mechanism to regulate the multi-level encoding process; weight sharing helps recognize patterns left undiscovered by a single-level encoding strategy. We perform experiments with six encoding levels and weight sharing on the widely used Gigaword and DUC2004 Task 1 sentence summarization datasets. The experiments show that MSE generates more readable (fluent) summaries (ROUGE-L score) than multiple benchmark models while preserving similar levels of informativeness (ROUGE-1 and ROUGE-2 scores). Moreover, human evaluation of the generated abstracts corroborates these findings of enhanced readability.
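To make the multi-level idea concrete, below is a minimal sketch of an encoder whose weights are shared across encoding levels: a single recurrent encoder is applied repeatedly, each level re-reading the previous level's outputs. This is illustrative only; the bidirectional GRU, the projection that feeds encoder outputs back into the encoder, and names such as MultiLevelSharedEncoder and num_levels are assumptions for the sketch, not the authors' exact architecture or hyperparameters.

```python
# Minimal sketch of multi-level encoding with a shared-weight encoder
# (illustrative only -- not the paper's released implementation).
import torch
import torch.nn as nn


class MultiLevelSharedEncoder(nn.Module):
    """Re-encodes a sentence `num_levels` times with one shared GRU."""

    def __init__(self, vocab_size, emb_dim=256, hidden_dim=256, num_levels=6):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # A single GRU instance: its weights are reused at every level.
        self.encoder = nn.GRU(emb_dim, hidden_dim,
                              bidirectional=True, batch_first=True)
        # Project the 2*hidden_dim bidirectional outputs back to emb_dim so
        # the same GRU can re-read its own output at the next level.
        self.reproject = nn.Linear(2 * hidden_dim, emb_dim)
        self.num_levels = num_levels

    def forward(self, src_ids):
        x = self.embedding(src_ids)             # (batch, seq_len, emb_dim)
        for _ in range(self.num_levels):
            outputs, hidden = self.encoder(x)   # same weights at each pass
            x = torch.tanh(self.reproject(outputs))
        return outputs, hidden                  # handed to an attentive decoder


if __name__ == "__main__":
    enc = MultiLevelSharedEncoder(vocab_size=10_000)
    batch = torch.randint(0, 10_000, (2, 20))   # two toy source sentences
    outs, h = enc(batch)
    print(outs.shape)                           # torch.Size([2, 20, 512])
```

Sharing one encoder across levels keeps the parameter count close to that of a single-level model while letting later passes refine the representations produced by earlier ones, which matches the paper's intuition of re-reading a sentence before summarizing it.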



Author information

Corresponding author

Correspondence to Daisy Monika Lal.

Ethics declarations

Conflicts of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Lal, D.M., Singh, K.P. & Tiwary, U.S. Multi-level shared-weight encoding for abstractive sentence summarization. Neural Comput & Applic 34, 2965–2981 (2022). https://doi.org/10.1007/s00521-021-06566-7
