Abstract
In this chapter, we introduce word embeddings, which serve as core representations of text in deep learning approaches. We start with the distributional hypothesis and explain how it can be leveraged to form semantic representations of words. We discuss common distributional semantic models, including word2vec, GloVe, and their variants. We address the shortcomings of embedding models and their extension to document and concept representation. Finally, we discuss several applications to natural language processing tasks and present a case study focused on language modeling.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Kamath, U., Liu, J., Whitaker, J. (2019). Distributed Representations. In: Deep Learning for NLP and Speech Recognition. Springer, Cham. https://doi.org/10.1007/978-3-030-14596-5_5
DOI: https://doi.org/10.1007/978-3-030-14596-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14595-8
Online ISBN: 978-3-030-14596-5
eBook Packages: Computer Science, Computer Science (R0)