
An Unsupervised Character-Aware Neural Approach to Word and Context Representation Learning

  • Conference paper

Artificial Neural Networks and Machine Learning – ICANN 2018 (ICANN 2018)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11141)

Abstract

In recent years, neural networks have been used intensively to learn meaningful distributed representations of words and of the contexts around them. When these representations, also known as “embeddings”, are learned from large unsupervised corpora, they can be transferred to different tasks with positive effects on performance, especially when only a few supervised examples are available. In this work, we extend this idea and present an unsupervised neural architecture that jointly learns word and context embeddings, processing words as sequences of characters. This allows our model to capture regularities due to word morphology and avoids the need for a fixed-size input vocabulary of words. We show that we can learn compact encoders that, despite their relatively small number of parameters, achieve strong performance in downstream tasks when compared with related state-of-the-art approaches and with fully supervised methods.
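The core idea of the abstract — building a word representation from its character sequence rather than looking it up in a fixed word vocabulary — can be sketched as follows. This is a hypothetical minimal recurrent character encoder in numpy with random, untrained weights, illustrating the mechanism only; it is not the authors' actual architecture:

```python
import numpy as np

def char_rnn_embed(word, hidden_size=16, seed=0):
    """Encode a word as a fixed-size vector by running a simple
    recurrent cell over its characters (toy sketch, random weights)."""
    rng = np.random.default_rng(seed)
    vocab = 256  # byte-level input: no fixed word vocabulary is needed
    Wxh = rng.normal(scale=0.1, size=(hidden_size, vocab))
    Whh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
    h = np.zeros(hidden_size)
    for ch in word.encode("utf-8"):  # iterate over character bytes
        x = np.zeros(vocab)
        x[ch] = 1.0                  # one-hot character input
        h = np.tanh(Wxh @ x + Whh @ h)
    return h                         # final state = word embedding

e1 = char_rnn_embed("playing")
e2 = char_rnn_embed("played")  # shares the stem "play" with e1
print(e1.shape)                # (16,)
```

Because any byte sequence can be encoded, out-of-vocabulary words pose no problem, and morphologically related words such as "playing" and "played" are processed through the same hidden states for their shared prefix — the kind of regularity the paper's character-aware model is designed to exploit.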


Notes

  1. http://wacky.sslmit.unibo.it/doku.php?id=corpora



Author information

Corresponding author: Giuseppe Marra



Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Marra, G., Zugarini, A., Melacci, S., Maggini, M. (2018). An Unsupervised Character-Aware Neural Approach to Word and Context Representation Learning. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. ICANN 2018. Lecture Notes in Computer Science, vol 11141. Springer, Cham. https://doi.org/10.1007/978-3-030-01424-7_13

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-01424-7_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01423-0

  • Online ISBN: 978-3-030-01424-7

  • eBook Packages: Computer Science (R0)
