Low Frequency Words Compression in Neural Conversation System

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10635)

Abstract

Recently, the Encoder-Decoder framework for sequence-to-sequence (seq2seq) tasks has been widely used in open-domain generation-based conversation systems. One of the most difficult challenges in Encoder-Decoder based open-domain conversation systems is the unknown-words issue: numerous words become out-of-vocabulary words (OOVs) because the vocabulary size is restricted, yet a conversation system always tries to avoid producing them. This paper proposes a novel approach named Low Frequency Words Compression (LFWC) to address this problem by selectively representing low-frequency words with K-Components shared symbols. Whereas the standard Encoder-Decoder works at the word level, our LFWC Encoder-Decoder works at the symbol level; we propose a Sequence Transform to convert a word-level sequence into a symbol-level sequence and an LFWC-Predictor to decode a symbol-level sequence back into a word-level sequence. To measure the interference of OOVs in a neural conversation system, besides log-perplexity (LP), we apply two more suitable metrics, UP-LP and UP-Delta. Experiments show that decoding from compressed symbol-level sequences to word-level sequences achieves a recall@1 score of 60.9% under the strongest compression ratio, well above the baseline's 16.7%. They also show that our approach outperforms the standard Encoder-Decoder model in reducing OOV interference, achieving roughly half the UP-Delta score in most configurations.
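The core idea of the Sequence Transform, as described above, is to keep frequent words as-is and map every low-frequency word to one of K shared symbols. The paper's exact K-Components scheme is not spelled out here, so the sketch below is only an illustration under one assumption: low-frequency words are assigned to a bucket by a deterministic hash of the word (the symbol names `<C0>`, `<C1>`, … and the helper functions are hypothetical).

```python
import hashlib
from collections import Counter

def build_vocab(corpus, vocab_size):
    """Keep the vocab_size most frequent words; all others count as low-frequency."""
    counts = Counter(w for sent in corpus for w in sent)
    return {w for w, _ in counts.most_common(vocab_size)}

def stable_bucket(word, k):
    """Deterministically hash a word into one of K buckets
    (built-in hash() is salted per process, so use md5 instead)."""
    return int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % k

def sequence_transform(sentence, vocab, k):
    """Word-level -> symbol-level: in-vocabulary words pass through;
    low-frequency words are compressed to one of K shared symbols."""
    return [w if w in vocab else f"<C{stable_bucket(w, k)}>" for w in sentence]
```

In this sketch the inverse step (the LFWC-Predictor) would have to recover the original word from its shared symbol plus context, which is exactly the decoding task the recall@1 figure above evaluates.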

Keywords

seq2seq · Conversation system · Vocabulary · Encoder-Decoder · OOVs


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Software and Microelectronics, Peking University, Beijing, China
  2. National Research Center of Software Engineering, Peking University, Beijing, China
