Estimating Distributed Representations of Compound Words Using Recurrent Neural Networks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10260)

Abstract

Distributed representations of words play a crucial role in many natural language processing tasks. However, when such representations are learned, each word in the text corpus is treated as an individual token, so the distributed representations of compound words cannot be obtained directly. In this paper, we introduce a recurrent neural network (RNN)-based approach for estimating distributed representations of compound words. Experimental results show that the RNN-based approach estimates the distributed representations of compound words better than the average representation approach, which simply takes the average of the constituent words' representations as the estimated representation of a compound word. Furthermore, the characteristics of the estimated representations closely match those of the actual compound-word representations.
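
The contrast between the two approaches described in the abstract can be made concrete with a short sketch. The snippet below is illustrative only, not the authors' published implementation: it assumes pre-trained word vectors, uses PyTorch with a GRU as one common RNN choice, and all names (estimate_by_average, CompoundEstimator) are hypothetical. The baseline averages the constituent words' vectors; the RNN reads them in order and maps the final hidden state into the word-vector space, trained against the "actual" compound representation learned by treating the compound as a single token.

```python
# Illustrative sketch of the two estimation approaches; not the paper's code.
import torch
import torch.nn as nn


def estimate_by_average(word_vectors: torch.Tensor) -> torch.Tensor:
    """Baseline: average the constituent words' vectors.

    word_vectors: (num_words, dim) embeddings of the compound's parts.
    """
    return word_vectors.mean(dim=0)


class CompoundEstimator(nn.Module):
    """RNN-based estimator: read the constituent word vectors in order
    and project the final hidden state back into the word-vector space."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        # GRU is one possible RNN cell here; the paper's exact cell may differ.
        self.rnn = nn.GRU(input_size=dim, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, dim)

    def forward(self, word_vectors: torch.Tensor) -> torch.Tensor:
        # word_vectors: (batch, num_words, dim)
        _, h_n = self.rnn(word_vectors)   # h_n: (1, batch, hidden)
        return self.out(h_n.squeeze(0))   # (batch, dim)


# Training target: the "actual" compound vector, e.g. learned by treating the
# compound (such as "ice_cream") as a single token when training word2vec.
dim = 100
model = CompoundEstimator(dim)
loss_fn = nn.MSELoss()                 # assumed objective for illustration
parts = torch.randn(1, 2, dim)         # stand-in for ["ice", "cream"] vectors
target = torch.randn(1, dim)           # stand-in for the "ice_cream" vector
loss = loss_fn(model(parts), target)
loss.backward()
```

In this framing, the averaging baseline is order-insensitive, while the RNN can exploit the order of the constituents, which is one plausible reason for the improvement the abstract reports.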

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

1. SOKENDAI (The Graduate University for Advanced Studies), Tokyo, Japan
2. National Institute of Informatics, Tokyo, Japan