A Hierarchical Hybrid Neural Network Architecture for Chinese Text Summarization

  • Yunheng Zhang
  • Leihan Zhang
  • Ke Xu
  • Le Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11221)

Abstract

Abstractive text summarization with sequence-to-sequence models is generally plagued by three problems: the inability to handle out-of-vocabulary words, repetition in the generated summaries, and time-consuming training. This paper proposes a hierarchical hybrid neural network architecture for Chinese text summarization. Three mechanisms (hierarchical attention, a pointer mechanism and a coverage mechanism) are integrated into the architecture to improve summarization performance. The proposed model is applied to Chinese news headline generation. The experimental results show that the model outperforms the baseline in ROUGE scores and that all three mechanisms improve the quality of the summaries.
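The pointer and coverage mechanisms mentioned above can be sketched in a few lines. This is a minimal NumPy illustration following the general pointer-generator and coverage formulations from the literature, not the authors' exact model; the function names, shapes, and the generation probability `p_gen` are assumptions for illustration:

```python
import numpy as np

def pointer_generator_step(p_vocab, attention, src_ids, p_gen, vocab_size):
    """Mix the decoder's vocabulary distribution with a copy distribution
    induced by attention over the source tokens.

    p_vocab   : (vocab_size,) softmax over the (extended) vocabulary
    attention : (src_len,)    attention weights over source positions
    src_ids   : list mapping each source position to a vocabulary id,
                which lets out-of-vocabulary source words be copied
    p_gen     : scalar in [0, 1], probability of generating vs. copying
    """
    p_copy = np.zeros(vocab_size)
    for pos, tok_id in enumerate(src_ids):
        p_copy[tok_id] += attention[pos]  # scatter attention mass onto source tokens
    return p_gen * p_vocab + (1.0 - p_gen) * p_copy

def coverage_penalty(attention, coverage):
    """Coverage loss term: penalize attending again to source positions
    that have already accumulated attention, which discourages repetition.
    coverage is the sum of attention vectors from previous decoder steps."""
    return float(np.minimum(attention, coverage).sum())
```

Because the copy distribution places mass directly on source-token ids, a word absent from the generator's vocabulary can still receive nonzero probability, which is how the pointer mechanism addresses the out-of-vocabulary problem; the coverage penalty is added to the training loss to suppress repetition.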

Keywords

Abstractive text summarization · Hierarchical attention mechanism · Pointer mechanism · Coverage mechanism


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. State Key Laboratory of Software Development Environment, Beihang University, Beijing, China
  2. School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing, China
