Representing Words in Vector Space and Beyond

  • Chapter
  • In: Quantum-Like Models for Information Retrieval and Decision-Making

Abstract

Representing words, the basic units of language, is one of the most fundamental concerns in Information Retrieval, Natural Language Processing (NLP), and related fields. In this chapter, we review the main approaches to representing words in vector space (especially state-of-the-art word embeddings) and their downstream applications. We also discuss their limitations, current trends, and their connection to traditional vector space based approaches.

Notes

  1. https://code.google.com/archive/p/word2vec/.

  2. The word vectors are downloaded from http://nlp.stanford.edu/data/glove.6B.zip, with 6B tokens, a 400K uncased vocabulary, and 50-dimensional vectors (see the loading sketch after these notes).

  3. https://nlp.stanford.edu/projects/glove/.

  4. An example of hierarchical structures is shown at the following address: http://people.csail.mit.edu/torralba/research/LabelMe/wordnet/test.html.

  5. http://www.meta-net.eu/events/meta-forum-2016/slides/09_sennrich.pdf.

  6. https://blog.google/products/translate/found-translation-more-accurate-fluent-sentences-google-translate/.

  7. https://aclweb.org/aclwiki/Question_Answering_(State_of_the_art).

  8. http://www.abigailsee.com/2017/08/30/four-deep-learning-trends-from-acl-2017-part-1.html.

  9. https://nlp.stanford.edu/manning/talks/Simons-Institute-Manning-2017.pdf.

  10. The shifted PMI matrix is “the well-known word-context PMI matrix from the word-similarity literature, shifted by a constant offset” [60] (see the formula sketch after these notes).

  11. Trajectories were based on the cosine distance between the representations of two words over time (see the code sketch after these notes).
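
As a concrete companion to Notes 2 and 11, the sketch below loads the 50-dimensional glove.6B vectors and computes the cosine distance between two word representations, the quantity the trajectories are based on. It is a minimal illustration under our own assumptions: the archive is assumed to be unpacked to a local file named glove.6B.50d.txt, and the example words and helper names are ours, not the chapter's.

    import numpy as np

    def load_glove(path="glove.6B.50d.txt"):
        # Each line of the GloVe text file is a word followed by its float vector components.
        vectors = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                parts = line.rstrip().split(" ")
                vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
        return vectors

    def cosine_distance(u, v):
        # Cosine distance = 1 - cosine similarity (the measure behind the trajectories in Note 11).
        return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    vectors = load_glove()
    print(cosine_distance(vectors["king"], vectors["queen"]))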
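
Note 10 can also be made explicit. Levy and Goldberg [60] show that skip-gram with negative sampling implicitly factorizes the word-context PMI matrix shifted by a constant offset, the offset being log k for k negative samples, so each cell of the shifted matrix can be written as

    \mathrm{SPMI}(w, c) = \mathrm{PMI}(w, c) - \log k = \log \frac{P(w, c)}{P(w)\,P(c)} - \log k .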

References

  1. Aggarwal, C. C., & Zhai, C.-X. (2012). A survey of text classification algorithms. In Mining text data (pp. 163–222). Berlin: Springer.

  2. Athiwaratkun, B., & Wilson, A. G. (2017). Multimodal word distributions. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (pp. 1645–1656).

  3. Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. Preprint. arXiv:1409.0473.

  4. Bamler, R., & Mandt, S. (2017). Dynamic word embeddings. Preprint. arXiv:1702.08359.

  5. Barkan, O. (2017). Bayesian neural word embedding. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (pp. 3135–3143).

  6. Barkan, O., & Koenigstein, N. (2016). Item2Vec: Neural item embedding for collaborative filtering. In 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP) (pp. 1–6). Piscataway: IEEE.

  7. Bengio, Y., Ducharme, R., Vincent, P., & Janvin, C. (2003). A neural probabilistic language model. The Journal of Machine Learning Research, 3, 1137–1155.

  8. Bian, W., Li, S., Yang, Z., Chen, G., & Lin, Z. (2017). A compare-aggregate model with dynamic-clip attention for answer selection. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 1987–1990). New York: ACM.

  9. Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77.

  10. Blei, D. M., & Lafferty, J. D. (2006). Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning - ICML ’06 (pp. 113–120). New York: ACM.

  11. Blei, D. M., & Lafferty, J. D. (2009). Topic models. In A. Srivastava & M. Sahami (Eds.), Text mining: classification, clustering, and applications. Data mining and knowledge discovery series. London: Chapman & Hall.

  12. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.

  13. Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135–146.

  14. Bowman, S. R., Angeli, G., Potts, C., & Manning C. D. (2015). A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg: Association for Computational Linguistics.

  15. Boyd-Graber, J., Hu, Y., & Mimno, D. (2017). Applications of topic models. Foundations and Trends® in Information Retrieval, 11(2–3), 143–296.

  16. Bréal, M. (1897). Essai de Sémantique: Science des significations. Paris: Hachette.

  17. Brown, P. F., Desouza, P. V., Mercer, R. L., Pietra, V. J. D., & Lai, J. C. (1992). Class-based n-gram models of natural language. Computational Linguistics, 18(4), 467–479.

  18. Cho, K., Van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. Preprint. arXiv:1409.1259.

  19. Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., et al. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. Preprint. arXiv:1406.1078.

  20. Choi, E., He, H., Iyyer, M., Yatskar, M., Yih, W.-T., Choi, Y., et al. (2018). QuAC: Question answering in context. Preprint. arXiv:1808.07036.

  21. Collobert, R., & Weston, J. (2008). A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning (pp. 160–167). New York: ACM.

  22. Cui, H., Sun, R., Li, K., Kan, M.-Y., & Chua, T.-S. (2005). Question answering passage retrieval using dependency relations. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 400–407). New York: ACM.

  23. Cui, P., Wang, X., Pei, J., & Zhu, W. (2018). A survey on network embedding. IEEE Transactions on Knowledge and Data Engineering, 31(5), 833–852.

  24. Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391–407.

  25. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. Preprint. arXiv:1810.04805.

  26. Ding, Y., Liu, Y., Luan, H., & Sun, M. (2017). Visualizing and understanding neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (vol. 1, pp. 1150–1159).

  27. Dolan, W. B., & Brockett, C. (2005). Automatically constructing a corpus of sentential paraphrases. In Proceedings of the Third International Workshop on Paraphrasing (IWP2005).

  28. Dubey, A., Hefny, A., Williamson, S., & Xing, E. P. (2013). A nonparametric mixture model for topic modeling over time. In Proceedings of the 2013 SIAM International Conference on Data Mining (pp. 530–538).

  29. Dubossarsky, H., Weinshall, D., & Grossman, E. (2017). Outta control: Laws of semantic change and inherent biases in word representation models. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (vol. 1, pp. 1136–1145). Stroudsburg: Association for Computational Linguistics.

  30. Faruqui, M., Dodge, J., Jauhar, S. K., Dyer, C., Hovy, E., & Smith, N. A. (2014). Retrofitting word vectors to semantic lexicons. Preprint. arXiv:1411.4166.

  31. Fellbaum, C. (2000). WordNet: An electronic lexical database. Language, 76(3), 706.

  32. Feng, M., Xiang, B., Glass, M. R., Wang, L., & Zhou, B. (2015). Applying deep learning to answer selection: A study and an open task. Preprint. arXiv:1508.01585.

  33. Firth, J. R. (1957). A synopsis of linguistic theory 1930–55. In Studies in Linguistic Analysis (special volume of the Philological Society) (vol. 1952–59, pp. 1–32). Oxford: The Philological Society.

  34. Gehring, J., Auli, M., Grangier, D., Yarats, D., & Dauphin, Y. N. (2017). Convolutional sequence to sequence learning. Preprint. arXiv:1705.03122.

  35. Gittens, A., Achlioptas, D., & Mahoney, M. W. (2017). Skip-gram - Zipf + uniform = vector additivity. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (vol. 1, pp. 69–76).

  36. Goller, C., & Kuchler, A. (1996). Learning task-dependent distributed representations by backpropagation through structure. Neural Networks, 1, 347–352.

  37. Hamilton, W. L., Leskovec, J., & Jurafsky, D. (2016). Diachronic word embeddings reveal statistical laws of semantic change. Preprint. arXiv:1605.09096.

  38. Harris, Z. S. (1954). Distributional structure. Word, 10(2–3), 146–162.

  39. He, H., Gimpel, K., & Lin, J. (2015). Multi-perspective sentence similarity modeling with convolutional neural networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 1576–1586).

  40. He, H., & Lin, J. (2016). Pairwise word interaction modeling with deep neural networks for semantic similarity measurement. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 937–948).

  41. He, K., Girshick, R., & Dollár, P. (2018). Rethinking ImageNet pre-training. Preprint. arXiv:1811.08883.

  42. Heilman, M., & Smith, N. A. (2010). Tree edit models for recognizing textual entailments, paraphrases, and answers to questions. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 1011–1019). Stroudsburg: Association for Computational Linguistics.

  43. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.

  44. Hofmann, T. (1999). Probabilistic latent semantic indexing. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’99 (pp. 50–57). New York: ACM.

  45. Hu, B., Lu, Z., Li, H., & Chen, Q. (2014). Convolutional neural network architectures for matching natural language sentences. In Advances in Neural Information Processing Systems (pp. 2042–2050).

  46. Huang, Z., Xu, W., & Yu, K. (2015). Bidirectional LSTM-CRF models for sequence tagging. Preprint. arXiv:1508.01991.

  47. Iyyer, M., Yih, W.-T., & Chang, M.-W. (2017). Search-based neural structured learning for sequential question answering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (vol. 1, pp. 1821–1831).

  48. Joshi, M., Choi, E., Weld, D. S., & Zettlemoyer, L. (2017). TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension. Preprint. arXiv:1705.03551.

  49. Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2017). Bag of tricks for efficient text classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers (pp. 427–431). Stroudsburg: Association for Computational Linguistics.

  50. Kiela, D., Wang, C., & Cho, K. (2018). Dynamic meta-embeddings for improved sentence representations. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 1466–1477).

  51. Kim, Y. (2014). Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1746–1751).

  52. Kim, Y. (2014). Convolutional neural networks for sentence classification. Preprint. arXiv:1408.5882.

  53. Kočiský, T., Schwarz, J., Blunsom, P., Dyer, C., Hermann, K. M., Melis, G., et al. (2018). The NarrativeQA reading comprehension challenge. Transactions of the Association of Computational Linguistics, 6, 317–328.

  54. Kudo, T. (2018). Subword regularization: Improving neural network translation models with multiple subword candidates. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.

  55. Kulkarni, V., Al-Rfou, R., Perozzi, B., & Skiena, S. (2015). Statistically significant detection of linguistic change. In Proceedings of the 24th International Conference on World Wide Web (pp. 625–635).

  56. Kutuzov, A., Øvrelid, L., Szymanski, T., & Velldal, E. (2018). Diachronic word embeddings and semantic shifts: A survey. In Proceedings of the 27th International Conference on Computational Linguistics (pp. 1384–1397). Stroudsburg: Association for Computational Linguistics.

  57. Kwiatkowski, T., Palomaki, J., Rhinehart, O., Collins, M., Parikh, A., Alberti, C., et al. (2019). Natural questions: A benchmark for question answering research. Transactions of the Association of Computational Linguistics (to appear). https://tomkwiat.users.x20web.corp.google.com/papers/natural-questions/main-1455-kwiatkowski.pdf

  58. Lai, S., Liu, K., He, S., & Zhao, J. (2016). How to generate a good word embedding. IEEE Intelligent Systems, 31(6), 5–14.

  59. Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., & Dyer, C. (2016). Neural architectures for named entity recognition. Preprint. arXiv:1603.01360.

  60. Levy, O., & Goldberg, Y. (2014). Neural word embedding as implicit matrix factorization. In Advances in Neural Information Processing Systems (pp. 2177–2185).

  61. Li, J., Hu, R., Liu, X., Tiwari, P., Pandey, H. M., Chen, W., et al. (2019). A distant supervision method based on paradigmatic relations for learning word embeddings. Neural Computing and Applications. https://doi.org/10.1007/s00521-019-04071-6

  62. Li, Q., Uprety, S., Wang, B., & Song, D. (2018). Quantum-inspired complex word embedding. In Proceedings of the Third Workshop on Representation Learning for NLP, Melbourne (pp. 50–57). Stroudsburg: Association for Computational Linguistics.

  63. Lin, D., & Wu, X. (2009). Phrase clustering for discriminative learning. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 (pp. 1030–1038). Stroudsburg: Association for Computational Linguistics.

  64. Lin, R., Liu, S., Yang, M., Li, M., Zhou, M., & Li, S. (2015). Hierarchical recurrent neural network for document modeling. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 899–907).

  65. Lipton, Z. C. (2016). The mythos of model interpretability. Preprint. arXiv:1606.03490.

  66. Lucy, L., & Gauthier, J. (2017). Are distributional representations ready for the real world? Evaluating word vectors for grounded perceptual meaning. Preprint. arXiv:1705.11168.

  67. Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, and Computers, 28(2), 203–208.

  68. Melucci, M. (2015). Introduction to information retrieval and quantum mechanics. Berlin: Springer.

  69. Melucci, M., & van Rijsbergen, C. J. (2011). Quantum mechanics and information retrieval (chap. 6, pp. 125–155). Berlin: Springer.

  70. Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. Preprint. arXiv:1301.3781.

  71. Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (pp. 3111–3119).

  72. Mimno, D. (2012). Computational historiography. Journal on Computing and Cultural Heritage, 5(1), 1–19.

  73. Mitra, B., & Craswell, N. (2017). Neural models for information retrieval. Preprint. arXiv:1705.01509.

  74. Mitra, B., & Craswell, N. (2018). An introduction to neural information retrieval. Foundations and Trends in Information Retrieval, 13(1), 1–126.

  75. Mnih, A., & Hinton, G. (2007). Three new graphical models for statistical language modelling. In Proceedings of the 24th International Conference on Machine Learning (pp. 641–648). New York: ACM.

  76. Nguyen, T., Rosenberg, M., Song, X., Gao, J., Tiwary, S., Majumder, R., et al. (2016). MS MARCO: A human generated machine reading comprehension dataset. Preprint. arXiv:1611.09268.

  77. Nickel, M., & Kiela, D. (2017). Poincaré embeddings for learning hierarchical representations. In Advances in Neural Information Processing Systems (pp. 6338–6347).

  78. Pennington, J., Socher, R., & Manning, C. (2014). Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (vol. 14, pp. 1532–1543).

  79. Pereira, F., Tishby, N., & Lee, L. (1993). Distributional clustering of English words. In Proceedings of the 31st Annual Meeting on Association for Computational Linguistics (pp. 183–190). Stroudsburg: Association for Computational Linguistics.

  80. Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., et al. (2018). Deep contextualized word representations. Preprint. arXiv:1802.05365.

  81. Pollack, J. B. (1990). Recursive distributed representations. Artificial Intelligence, 46(1–2), 77–105.

  82. Punyakanok, V., Roth, D., & Yih, W.-T. (2004). Mapping dependencies trees: An application to question answering. In Proceedings of AI&Math 2004 (pp. 1–10).

  83. Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/languageunderstanding_paper.pdf

  84. Rajpurkar, P., Jia, R., & Liang, P. (2018). Know what you don’t know: Unanswerable questions for squad. Preprint. arXiv:1806.03822.

  85. Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). Squad: 100,000+ questions for machine comprehension of text. Preprint. arXiv:1606.05250.

  86. Rao, J., He, H., & Lin, J. (2016). Noise-contrastive estimation for answer selection with deep neural networks. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (pp. 1913–1916). New York: ACM.

  87. Reddy, S., Chen, D., & Manning, C. D. (2018). CoQA: A conversational question answering challenge. Preprint. arXiv:1808.07042.

  88. Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Foundations and Trends® in Information Retrieval, 3(4), 333–389.

  89. Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323–2326.

  90. Rudolph, M., & Blei, D. (2018). Dynamic Bernoulli embeddings for language evolution. In Proceedings of the 2018 World Wide Web Conference (pp. 1003–1011).

  91. Rudolph, M., Ruiz, F., Mandt, S., & Blei, D. (2016). Exponential family embeddings. In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS’16 (pp. 478–486). Red Hook: Curran Associates Inc.

  92. Saha, A., Pahuja, V., Khapra, M. M., Sankaranarayanan, K., & Chandar, S. (2018). Complex sequential question answering: Towards learning to converse over linked question answer pairs with a knowledge graph. Preprint. arXiv:1801.10314.

  93. Sala, F., De Sa, C., Gu, A., & Ré, C. (2018). Representation tradeoffs for hyperbolic embeddings. In International Conference on Machine Learning (pp. 4457–4466).

  94. Schnabel, T., Labutov, I., Mimno, D., & Joachims, T. (2015). Evaluation methods for unsupervised word embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 298–307).

  95. Severyn, A., & Moschitti, A. (2013). Automatic feature engineering for answer selection and extraction. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (pp. 458–467).

  96. Severyn, A., & Moschitti, A. (2015). Learning to rank short text pairs with convolutional deep neural networks. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 373–382). New York: ACM.

  97. Shen, G., Yang, Y., & Deng, Z.-H. (2017). Inter-weighted alignment network for sentence pair modeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 1179–1189).

  98. Shi, B., Lam, W., Jameel, S., Schockaert, S., & Lai, K. P. (2017). Jointly learning word embeddings and latent topics. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR ’17 (pp. 375–384). New York: ACM.

  99. Steyvers, M., & Griffiths, T. (2007). Probabilistic topic models. In Handbook of latent semantic analysis (pp. 424–440).

  100. Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems (pp. 3104–3112).

  101. Tai, K. S., Socher, R., & Manning, C. D. (2015). Improved semantic representations from tree-structured long short-term memory networks. Preprint. arXiv:1503.00075.

  102. Talmor, A., & Berant, J. (2018). The web as a knowledge-base for answering complex questions. Preprint. arXiv:1803.06643.

  103. Tay, Y., Luu, A. T., & Hui, S. C. (2017). Enabling efficient question answer retrieval via hyperbolic neural networks. CoRR abs/1707.07847.

  104. Tay, Y., Phan, M. C., Tuan, L. A., & Hui, S. C. (2017). Learning to rank question answer pairs with holographic dual LSTM architecture. Preprint. arXiv:1707.06372.

  105. Tay, Y., Tuan, L. A., & Hui, S. C. (2018). Multi-cast attention networks. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 2299–2308). New York: ACM.

  106. Tran, Q. H., Lai, T., Haffari, G., Zukerman, I., Bui, T., & Bui, H. (2018). The context-dependent additive recurrent neural net. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) (vol. 1, pp. 1274–1283).

  107. Trischler, A., Wang, T., Yuan, X., Harris, J., Sordoni, A., Bachman, P., & Suleman, K. (2016). NewsQA: A machine comprehension dataset. Preprint. arXiv:1611.09830.

  108. Trischler, A., Wang, T., Yuan, X., Harris, J., Sordoni, A., Bachman, P., et al. (2017). NewsQA: A machine comprehension dataset. In Proceedings of the 2nd Workshop on Representation Learning for NLP (pp. 191–200). Stroudsburg: Association for Computational Linguistics.

  109. Upadhyay, S., Chang, K. W., Taddy, M., Kalai, A., & Zou, J. (2017). Beyond bilingual: Multi-sense word embeddings using multilingual context. Preprint. arXiv:1706.08160.

  110. Van Rijsbergen, C. J. (2004). The geometry of information retrieval. Cambridge: Cambridge University Press.

  111. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., et al. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (pp. 5998–6008).

  112. Vilnis, L., & McCallum, A. (2014). Word representations via Gaussian embedding. Preprint. arXiv:1412.6623.

  113. Vorontsov, K., Potapenko, A., & Plavin, A. (2015). Additive regularization of topic models for topic selection and sparse factorization. In International Symposium on Statistical Learning and Data Sciences (pp. 193–202). Berlin: Springer.

  114. Wang, B., Li, Q., Melucci, M., & Song, D. (2019). Semantic Hilbert space for text representation learning. Preprint. arXiv:1902.09802.

  115. Wang, B., Niu, J., Ma, L., Zhang, Y., Zhang, L., Li, J., et al. (2016). A Chinese question answering approach integrating count-based and embedding-based features. In Natural Language Understanding and Intelligent Applications (pp. 934–941). Cham: Springer.

  116. Wang, D., & Nyberg, E. (2015). A long short-term memory model for answer sentence selection in question answering. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers) (vol. 2, pp. 707–712).

  117. Wang, L., Yao, J., Tao, Y., Zhong, L., Liu, W., & Du, Q. (2018). A reinforced topic-aware convolutional sequence-to-sequence model for abstractive text summarization. Preprint. arXiv:1805.03616.

  118. Wang, M., & Manning, C. D. (2010). Probabilistic tree-edit models with structured latent variables for textual entailment and question answering. In Proceedings of the 23rd International Conference on Computational Linguistics (pp. 1164–1172). Stroudsburg: Association for Computational Linguistics.

  119. Wang, M., Smith, N. A., & Mitamura, T. (2007). What is the jeopardy model? A quasi-synchronous grammar for QA. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).

  120. Wang, Z., Hamza, W., & Florian, R. (2017). Bilateral multi-perspective matching for natural language sentences. Preprint. arXiv:1702.03814.

  121. Wong, S. K. M., Ziarko, W., & Wong, P. C. N. (1985). Generalized vector spaces model in information retrieval. In Proceedings of the 8th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR ’85 (pp. 18–25). New York: ACM.

  122. Yang, L., Ai, Q., Guo, J., & Croft, W. B. (2016). aNMM: Ranking short answer texts with attention-based neural matching model. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (pp. 287–296). New York: ACM.

  123. Yang, Y., Yih, W.-T., & Meek, C. (2015). WikiQA: A challenge dataset for open-domain question answering. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 2013–2018).

  124. Yao, X., Van Durme, B., Callison-Burch, C., & Clark, P. (2013). Answer extraction as sequence tagging with tree edit distance. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 858–867).

  125. Yao, Z., Sun, Y., Ding, W., Rao, N., & Xiong, H. (2018). Dynamic word embeddings for evolving semantic discovery. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining - WSDM ’18 (pp. 673–681). New York: ACM.

  126. Yih, W. T., Chang, M. W., Meek, C., & Pastusiak, A. (2013). Question answering using enhanced lexical semantic models. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (vol. 1, pp. 1744–1753).

  127. Yin, W., & Schütze, H. (2015). Learning meta-embeddings by using ensembles of embedding sets. Preprint. arXiv:1508.04257.

  128. Yu, L., Hermann, K. M., Blunsom, P., & Pulman, S. (2014). Deep learning for answer sentence selection. Preprint. arXiv:1412.1632.

  129. Zamani, H., & Croft, W. B. (2017). Relevance-based word embedding. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR’17 (pp. 505–514). New York: ACM.

  130. Zhai, C., & Lafferty, J. (2004). A study of smoothing methods for language models applied to information retrieval. Transactions on Information Systems, 22(2), 179–214.

  131. Zhang, P., Niu, J., Su, Z., Wang, B., Ma, L., & Song, D. (2018). End-to-end quantum-like language models with application to question answering. In The Thirty-Second AAAI Conference on Artificial Intelligence. Menlo Park: Association for the Advancement of Artificial Intelligence.

  132. Zhang, X., Zhao, J., & LeCun, Y. (2015). Character-level convolutional networks for text classification. In Advances in neural information processing systems (pp. 649–657). Cambridge: MIT Press.

Author information

Correspondence to Benyou Wang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Wang, B., Di Buccio, E., Melucci, M. (2019). Representing Words in Vector Space and Beyond. In: Aerts, D., Khrennikov, A., Melucci, M., Toni, B. (eds) Quantum-Like Models for Information Retrieval and Decision-Making. STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health. Springer, Cham. https://doi.org/10.1007/978-3-030-25913-6_5
