Deep Text Generation – Using Hierarchical Decomposition to Mitigate the Effect of Rare Data Points

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10318)

Abstract

Deep learning has recently been adopted for the task of natural language generation (NLG) and has shown remarkable results. However, learning can go awry when the training dataset is too small or poorly balanced with regard to the input sequences it contains. This is common in naturally occurring datasets, such as those scraped from the web, which were originally compiled for a different purpose and were never prepared for natural language processing. To mitigate the problem of unbalanced training data, we propose to decompose a large natural language dataset into several subsets that “talk about” the same thing. We show that this decomposition helps to focus each learner’s attention during training. Results from a proof-of-concept study show 73% faster learning than a flat model, together with better generation results.
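Since the abstract describes the approach only at a high level, a minimal sketch may help: the snippet below partitions a toy corpus into topical subsets and trains one small generator per subset. The clustering method (TF-IDF with k-means) and the per-cluster bigram generator are illustrative assumptions for this sketch, not the authors' actual pipeline, which uses deep sequence models.

    # A minimal sketch of the decomposition idea: cluster a corpus into
    # subsets that "talk about" the same thing, then train one generator
    # per subset. TF-IDF + k-means and the bigram table are stand-ins,
    # not the paper's method.
    import random
    from collections import defaultdict

    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer

    corpus = [
        "the restaurant serves cheap italian food",
        "a moderately priced french restaurant in the centre",
        "cheap chinese food is available downtown",
        "the hotel offers free parking and wifi",
        "a four star hotel near the river with parking",
        "the hotel has a pool and breakfast included",
    ]

    # Step 1: decompose the dataset into topical subsets
    # (here: 2 clusters over TF-IDF vectors).
    vectors = TfidfVectorizer().fit_transform(corpus)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

    subsets = defaultdict(list)
    for sentence, label in zip(corpus, labels):
        subsets[label].append(sentence)

    # Step 2: train one generator per subset, so each learner only
    # attends to sequences about a single topic.
    def train_bigram(sentences):
        model = defaultdict(list)
        for s in sentences:
            tokens = ["<s>"] + s.split() + ["</s>"]
            for a, b in zip(tokens, tokens[1:]):
                model[a].append(b)
        return model

    def generate(model, max_len=12):
        token, out = "<s>", []
        for _ in range(max_len):
            token = random.choice(model[token])
            if token == "</s>":
                break
            out.append(token)
        return " ".join(out)

    models = {label: train_bigram(sents) for label, sents in subsets.items()}
    for label, model in models.items():
        print(f"cluster {label}: {generate(model)}")

In the paper's setting, each per-subset learner would be a recurrent sequence model rather than a bigram table; the sketch only illustrates the decomposition step that lets each learner specialise.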

Keywords

Artificial intelligence · Natural language processing · Deep learning


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

School of Engineering and Computer Science, University of Hull, Hull, UK
