Topic Attentional Neural Network for Abstractive Document Summarization

  • Hao Liu
  • Hai-Tao Zheng
  • Wei Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11440)


Abstractive summarization is a renewed and challenging task in document summarization. Recently, neural networks, especially the attentional encoder-decoder architecture, have achieved impressive progress in abstractive document summarization. However, the saliency of summaries, one of the key factors in document summarization, still needs improvement. In this paper, we propose the Topic Attentional Neural Network (TANN), which incorporates topic information into neural networks to tackle this issue. Our model is based on the attentional sequence-to-sequence structure but has paired encoders and paired attention mechanisms to process the original document and topic information in parallel. Moreover, we propose a novel selection method called topic selection, which uses topic information to improve the standard selection step of beam search and chooses a better candidate as the final summary. We conduct experiments on the CNN/Daily Mail dataset. The results show that our model obtains higher ROUGE scores and achieves competitive performance compared with state-of-the-art abstractive and extractive models. Human evaluation also demonstrates that our model generates summaries that are more informative and readable.
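The paper gives the full formulation of the paired attention and topic selection mechanisms; as a rough illustration only, the sketch below shows the general idea of attending over document and topic representations in parallel and re-ranking beam candidates by topic overlap. All function names, the concatenation-based combination, and the word-overlap scoring are hypothetical simplifications, not the authors' actual method.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - x.max())
    return e / e.sum()

def paired_attention(dec_state, doc_states, topic_states):
    """Attend separately over document and topic encoder states,
    then combine the two context vectors (simplified sketch)."""
    doc_weights = softmax(doc_states @ dec_state)      # (n_doc,)
    topic_weights = softmax(topic_states @ dec_state)  # (n_topic,)
    doc_ctx = doc_weights @ doc_states                 # (d,)
    topic_ctx = topic_weights @ topic_states           # (d,)
    # One possible combination: concatenate the two contexts.
    return np.concatenate([doc_ctx, topic_ctx])        # (2d,)

def topic_select(candidates, topic_words):
    """Hypothetical stand-in for topic selection: pick the beam-search
    candidate with the largest overlap with the topic words."""
    def overlap(summary):
        return len(set(summary.split()) & set(topic_words))
    return max(candidates, key=overlap)
```

In this sketch the decoder would use the concatenated context at each step, and `topic_select` would replace the usual "highest log-probability" pick among finished beam hypotheses.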


Abstractive summarization · Neural network · Topic information · Attention mechanism



This research is supported by National Natural Science Foundation of China (Grant No. 61773229), Basic Scientific Research Program of Shenzhen City (Grant No. JCYJ20160331184440545), and Overseas Cooperation Research Fund of Graduate School at Shenzhen, Tsinghua University (Grant No. HW2018002).


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Tsinghua-Southampton Web Science Laboratory, Graduate School at Shenzhen, Tsinghua University, Shenzhen, China