Abstract
Sentence heterogeneity arises in sequence-to-sequence tasks such as machine translation: sentences with widely varying meanings or grammatical structures can make the network harder to train to convergence. In this paper, we introduce a model that resolves this heterogeneity. The Multi-filter Gaussian Mixture Autoencoder (MGMAE) uses an autoencoder to learn representations of the inputs; the representations are the encoder outputs, lying in a latent space whose dimension equals the encoder's hidden dimension. The latent representations of the training data are used to fit a Gaussian mixture, which partitions them into several Gaussian components. A filter (decoder) is then tuned specifically to the data in one of these Gaussians. Each Gaussian corresponds to one filter, so that filter handles the heterogeneity within its own Gaussian, and the heterogeneity of the training data as a whole is resolved. Comparative experiments are conducted on the Geo-query dataset and on English-French translation. They show that, compared with a traditional encoder-decoder model, this network achieves better performance on sequence-to-sequence tasks such as machine translation and question answering.
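The following is a minimal sketch of the pipeline the abstract describes, not the authors' implementation: a shared encoder maps token sequences to latent vectors, a Gaussian mixture is fit on those vectors, and each mixture component gets its own decoder ("filter") trained only on that component's data. The PyTorch LSTM architecture, scikit-learn GaussianMixture, and all names and sizes below are illustrative assumptions.

import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

VOCAB, HIDDEN, K = 1000, 128, 3   # assumed vocabulary size, latent dimension, number of Gaussians

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.lstm = nn.LSTM(HIDDEN, HIDDEN, batch_first=True)

    def forward(self, tokens):               # tokens: (batch, seq_len) of token ids
        _, (h, _) = self.lstm(self.embed(tokens))
        return h[-1]                          # latent representation: (batch, HIDDEN)

class Decoder(nn.Module):
    """One 'filter': decodes a latent vector into output-token logits."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(HIDDEN, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def forward(self, latent, tgt_len):
        # Feed the latent vector at every time step (a simple conditioning choice).
        steps = latent.unsqueeze(1).expand(-1, tgt_len, -1)
        y, _ = self.lstm(steps)
        return self.out(y)                    # (batch, tgt_len, VOCAB)

encoder = Encoder()
decoders = nn.ModuleList(Decoder() for _ in range(K))

# 1) Encode the training sequences into the latent space.
src = torch.randint(0, VOCAB, (32, 10))       # toy batch of source token ids
with torch.no_grad():
    latents = encoder(src)

# 2) Fit a Gaussian mixture on the latent representations
#    (diagonal covariances keep this toy fit well-conditioned).
gmm = GaussianMixture(n_components=K, covariance_type="diag").fit(latents.numpy())

# 3) Route each example to the decoder of its mixture component, so each
#    filter is tuned only on the data falling in one Gaussian.
assignments = torch.from_numpy(gmm.predict(latents.numpy()))
for k in range(K):
    mask = assignments == k
    if mask.any():
        logits = decoders[k](latents[mask], tgt_len=10)
        # ...compute the usual seq2seq loss on this subset and update decoder k only.

At inference time, a new input would presumably be encoded, assigned to its most likely Gaussian via gmm.predict, and decoded by the corresponding filter.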