Interpreting Word Embeddings Using a Distribution Agnostic Approach Employing Hellinger Distance

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12284)

Abstract

Word embeddings can encode semantic and syntactic features and have achieved many recent successes in solving NLP tasks. Despite these successes, extracting lexical information directly from them is not trivial. In this paper, we propose a transformation of the embedding space to a more interpretable one using the Hellinger distance. We additionally suggest a distribution-agnostic approach using Kernel Density Estimation. We also introduce a method to measure the interpretability of word embeddings. Our results suggest that the Hellinger-based calculation gives a 1.35% improvement on average over the Bhattacharyya distance in terms of interpretability and adapts better to unknown words.
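
For readers unfamiliar with the two measures named in the abstract, the following is a minimal sketch of how the Hellinger and Bhattacharyya distances can be computed between two kernel density estimates. It is not the authors' implementation. The Gaussian kernel, Silverman's bandwidth rule, the integration grid, the synthetic samples, and all function names are assumptions made for illustration. Both distances derive from the Bhattacharyya coefficient BC = ∫ sqrt(p(x) q(x)) dx: the Hellinger distance is sqrt(1 - BC) and is bounded in [0, 1], while the Bhattacharyya distance is -ln(BC) and is unbounded.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a density function estimated from 1-D samples with a Gaussian kernel."""
    n = len(samples)
    if bandwidth is None:
        # Silverman's rule of thumb (an assumption; other selectors would work too).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def bhattacharyya_coefficient(p, q, lo, hi, steps=1000):
    """Integrate sqrt(p(x) * q(x)) over [lo, hi] with the trapezoidal rule."""
    h = (hi - lo) / steps
    values = []
    for i in range(steps + 1):
        x = lo + i * h
        values.append(math.sqrt(p(x) * q(x)))
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def compare(a, b):
    """Return (Hellinger distance, Bhattacharyya distance) between KDEs of a and b."""
    pad = 3.0 * statistics.stdev(a + b)  # extend the grid to cover the kernel tails
    lo, hi = min(a + b) - pad, max(a + b) + pad
    bc = bhattacharyya_coefficient(gaussian_kde(a), gaussian_kde(b), lo, hi)
    hellinger = math.sqrt(max(0.0, 1.0 - bc))         # bounded in [0, 1]
    bhattacharyya = -math.log(bc) if bc > 0.0 else math.inf  # unbounded
    return hellinger, bhattacharyya


# Hypothetical data: values of one embedding dimension for words inside and
# outside some semantic category (synthetic, for illustration only).
random.seed(0)
in_category = [random.gauss(1.0, 0.5) for _ in range(200)]
out_category = [random.gauss(-1.0, 1.0) for _ in range(200)]

h_diff, bd_diff = compare(in_category, out_category)
h_same, bd_same = compare(in_category, in_category)
print(f"different samples: Hellinger={h_diff:.3f}, Bhattacharyya={bd_diff:.3f}")
print(f"identical samples: Hellinger={h_same:.3f}, Bhattacharyya={bd_same:.3f}")
```

Because the densities are estimated from the samples rather than assumed to be Gaussian, the sketch makes no parametric assumption about how the values are distributed, which is the sense of "distribution-agnostic" illustrated here.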

Notes

  1. https://github.com/ficstamas/word_embedding_interpretability

Acknowledgements

This research was supported by the European Union and co-funded by the European Social Fund through the project “Integrated program for training new generation of scientists in the fields of computer science” (EFOP-3.6.3-VEKOP-16-2017-0002) and by the National Research, Development and Innovation Office of Hungary through the Artificial Intelligence National Excellence Program (2018-1.2.1-NKP-2018-00008).

Corresponding author

Correspondence to Tamás Ficsor.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ficsor, T., Berend, G. (2020). Interpreting Word Embeddings Using a Distribution Agnostic Approach Employing Hellinger Distance. In: Sojka, P., Kopeček, I., Pala, K., Horák, A. (eds) Text, Speech, and Dialogue. TSD 2020. Lecture Notes in Computer Science, vol 12284. Springer, Cham. https://doi.org/10.1007/978-3-030-58323-1_21

  • DOI: https://doi.org/10.1007/978-3-030-58323-1_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58322-4

  • Online ISBN: 978-3-030-58323-1

  • eBook Packages: Computer Science, Computer Science (R0)
