Abstract
Sentence heterogeneity arises in sequence-to-sequence tasks such as machine translation: sentences with widely varying meanings or grammatical structures can make the network harder to train to convergence. In this paper, we introduce a model that resolves this heterogeneity. The Multi-filter Gaussian Mixture Autoencoder (MGMAE) uses an autoencoder to learn representations of the inputs; the representations are the encoder outputs, lying in a latent space whose dimension equals the encoder's hidden dimension. The latent representations of the training data are used to fit a Gaussian mixture, which partitions them into several Gaussian components. A filter (decoder) is then tuned specifically to the data in one of these Gaussians. Each Gaussian corresponds to one filter, so that filter handles the heterogeneity within its own Gaussian, and the heterogeneity of the training data as a whole is resolved. Comparative experiments are conducted on the Geo-query dataset and on English-French translation. They show that, compared with a traditional encoder-decoder model, this network achieves better performance on sequence-to-sequence tasks such as machine translation and question answering.
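The following is a minimal sketch of the pipeline the abstract describes, not the authors' implementation: a shared encoder maps token sequences to latent vectors, a Gaussian mixture is fit on those vectors, and each mixture component gets its own decoder ("filter") trained only on that component's data. The PyTorch LSTM architecture, scikit-learn GaussianMixture, and all names and sizes below are illustrative assumptions.

import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

VOCAB, HIDDEN, K = 1000, 128, 3   # assumed vocabulary size, latent dimension, number of Gaussians

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.lstm = nn.LSTM(HIDDEN, HIDDEN, batch_first=True)

    def forward(self, tokens):               # tokens: (batch, seq_len) of token ids
        _, (h, _) = self.lstm(self.embed(tokens))
        return h[-1]                          # latent representation: (batch, HIDDEN)

class Decoder(nn.Module):
    """One 'filter': decodes a latent vector into output-token logits."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(HIDDEN, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def forward(self, latent, tgt_len):
        # Feed the latent vector at every time step (a simple conditioning choice).
        steps = latent.unsqueeze(1).expand(-1, tgt_len, -1)
        y, _ = self.lstm(steps)
        return self.out(y)                    # (batch, tgt_len, VOCAB)

encoder = Encoder()
decoders = nn.ModuleList(Decoder() for _ in range(K))

# 1) Encode the training sequences into the latent space.
src = torch.randint(0, VOCAB, (32, 10))       # toy batch of source token ids
with torch.no_grad():
    latents = encoder(src)

# 2) Fit a Gaussian mixture on the latent representations
#    (diagonal covariances keep this toy fit well-conditioned).
gmm = GaussianMixture(n_components=K, covariance_type="diag").fit(latents.numpy())

# 3) Route each example to the decoder of its mixture component, so each
#    filter is tuned only on the data falling in one Gaussian.
assignments = torch.from_numpy(gmm.predict(latents.numpy()))
for k in range(K):
    mask = assignments == k
    if mask.any():
        logits = decoders[k](latents[mask], tgt_len=10)
        # ...compute the usual seq2seq loss on this subset and update decoder k only.

At inference time, a new input would presumably be encoded, assigned to its most likely Gaussian via gmm.predict, and decoded by the corresponding filter.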