Representation Learning in Sequence to Sequence Tasks: Multi-filter Gaussian Mixture Autoencoder

  • Conference paper
Proceedings of the Future Technologies Conference (FTC) 2021, Volume 1 (FTC 2021)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 358)

Abstract

Heterogeneity among sentences is inherent in sequence-to-sequence tasks such as machine translation: sentences with widely varying meanings or grammatical structures can make it harder for the network to converge during training. In this paper, we introduce a model that resolves this heterogeneity. The Multi-filter Gaussian Mixture Autoencoder (MGMAE) uses an autoencoder to learn representations of the inputs; these representations are the encoder outputs and lie in a latent space whose dimension equals the encoder's hidden dimension. A Gaussian mixture model is trained on the latent representations of the training data, partitioning the latent space into several Gaussian components. Each component is assigned its own filter (a decoder) that is tuned specifically to the data falling within that component, so each filter only has to account for the variability inside its own Gaussian, and the heterogeneity of the training data is thereby resolved. Comparative experiments on the Geo-query dataset and on English-French translation show that, compared with a traditional encoder-decoder model, this network achieves better performance on sequence-to-sequence tasks such as machine translation and question answering.
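
Since no reference implementation accompanies this page, the sketch below illustrates the architecture the abstract describes, under stated assumptions: a shared LSTM encoder, a Gaussian mixture fitted on the latent codes (scikit-learn's GaussianMixture stands in here for whatever fitting procedure the authors used), and one LSTM decoder per mixture component that handles only the examples routed to it. Every name in it (MultiFilterGMAutoencoder, fit_gmm, n_filters, and so on) is an illustrative choice of mine, not the authors' code.

import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

class MultiFilterGMAutoencoder(nn.Module):
    """Shared encoder, a Gaussian mixture over latent codes, and one
    decoder ("filter") per mixture component. Illustrative sketch only."""

    def __init__(self, vocab_size, hidden_dim, n_filters):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.encoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        # One decoder per Gaussian component; all share one output head.
        self.decoders = nn.ModuleList(
            nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
            for _ in range(n_filters)
        )
        self.out = nn.Linear(hidden_dim, vocab_size)
        self.gmm = GaussianMixture(n_components=n_filters)

    def encode(self, src):
        # Final hidden state of the encoder serves as the latent code.
        _, (h, _) = self.encoder(self.embed(src))
        return h[-1]                       # shape: (batch, hidden_dim)

    def fit_gmm(self, src):
        # Fit the mixture on the latent codes of the training inputs.
        self.gmm.fit(self.encode(src).detach().cpu().numpy())

    def forward(self, src, tgt):
        z = self.encode(src)
        # Assign every example to its most likely Gaussian component.
        comp = self.gmm.predict(z.detach().cpu().numpy())
        logits = z.new_zeros(tgt.size(0), tgt.size(1), self.out.out_features)
        for k, dec in enumerate(self.decoders):
            idx = torch.from_numpy(comp == k).nonzero(as_tuple=True)[0]
            if idx.numel() == 0:
                continue
            # Seed the k-th decoder with the latent code as initial state.
            h0 = z[idx].unsqueeze(0).contiguous()
            out, _ = dec(self.embed(tgt[idx]), (h0, torch.zeros_like(h0)))
            logits[idx] = self.out(out)
        return logits

A minimal usage pattern, assuming the mixture is fitted before any routed forward pass:

model = MultiFilterGMAutoencoder(vocab_size=10000, hidden_dim=256, n_filters=4)
src = torch.randint(0, 10000, (32, 20))   # batch of source token ids
tgt = torch.randint(0, 10000, (32, 22))   # batch of target token ids
model.fit_gmm(src)                        # fit the mixture first
logits = model(src, tgt)                  # shape (32, 22, 10000)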

Author information

Corresponding author

Correspondence to Yunhao Yang.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yang, Y., Xue, Z. (2022). Representation Learning in Sequence to Sequence Tasks: Multi-filter Gaussian Mixture Autoencoder. In: Arai, K. (eds) Proceedings of the Future Technologies Conference (FTC) 2021, Volume 1. FTC 2021. Lecture Notes in Networks and Systems, vol 358. Springer, Cham. https://doi.org/10.1007/978-3-030-89906-6_15
