Vec2graph: A Python Library for Visualizing Word Embeddings as Graphs

Katricheva, Nadezda; Yaskevich, Alyaxey; Lisitsina, Anastasiya; Zhordaniya, Tamara; Kutuzov, Andrey; Kuzmenko, Elizaveta

doi:10.1007/978-3-030-39575-9_20

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1086)

Included in the following conference series:

International Conference on Analysis of Images, Social Networks and Texts


Abstract

Visualization, as a means of conveying ideas easily, plays a key role in communicating linguistic theory through its applications. User-friendly NLP visualization tools give researchers important insights for building, challenging, proving or rejecting their hypotheses. At the same time, visualizations give the general public some understanding of what computational linguists investigate.

In this paper, we present vec2graph: a ready-to-use Python 3 library that visualizes vector representations (for example, word embeddings) as dynamic and interactive graphs. It is aimed at users with beginner-level knowledge of software development, and can be used to easily produce visualizations suitable for the Web. We describe the key ideas behind vec2graph, its hyperparameters, and its integration into existing word embedding frameworks.
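The general idea of turning vector representations into a graph can be illustrated with a minimal sketch. This is not the vec2graph API: the function `neighbor_graph`, its parameters `topn` and `threshold`, the toy vectors, and the node/link output layout are all assumptions made for illustration. The sketch connects a query word to its nearest neighbors by cosine similarity and keeps only edges above a similarity threshold.

```python
import json
import math

# Toy 3-dimensional "embeddings" (hypothetical values); real models
# typically have hundreds of dimensions.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "bus": [0.0, 0.8, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def neighbor_graph(vectors, query, topn=2, threshold=0.0):
    """Build a node/link dictionary for `query` and its `topn` nearest neighbors.

    An edge is kept only if the cosine similarity of its two endpoints
    is at least `threshold`. Both parameter names are illustrative.
    """
    sims = [
        (word, cosine(vectors[query], vec))
        for word, vec in vectors.items()
        if word != query
    ]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    words = [query] + [word for word, _ in sims[:topn]]

    nodes = [{"id": word} for word in words]
    links = []
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            similarity = cosine(vectors[a], vectors[b])
            if similarity >= threshold:
                links.append(
                    {"source": a, "target": b, "similarity": round(similarity, 3)}
                )
    return {"nodes": nodes, "links": links}


graph = neighbor_graph(TOY_VECTORS, "cat", topn=2, threshold=0.5)
print(json.dumps(graph, indent=2))
```

A dictionary of nodes and links like this is a common input shape for force-directed graph layouts in the browser. How vec2graph itself stores and renders its graphs is described in the full paper, not in this sketch.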

N. Katricheva and A. Yaskevich contributed equally to the paper.


Notes

1. All English word embedding models used were downloaded from the NLPL Vectors repository [4].
2. All Russian word embeddings used were downloaded from RusVectores service [9].
3. https://projector.tensorflow.org/
4. https://github.com/anvaka/word2vec-graph

References

1. Belinkov, Y., Glass, J.: Analysis methods in neural language processing: a survey. Trans. Assoc. Comput. Linguist. 7, 49–72 (2019). https://doi.org/10.1162/tacl_a_00254
2. Bengio, Y., Ducharme, R., Vincent, P., Jauvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3(Feb), 1137–1155 (2003)
3. Bostock, M., Ogievetsky, V., Heer, J.: D³: data-driven documents. IEEE Trans. Vis. Comput. Graph. 17, 2301–2309 (2011). https://doi.org/10.1109/TVCG.2011.185
4. Fares, M., Kutuzov, A., Oepen, S., Velldal, E.: Word vectors, reuse, and replicability: towards a community repository of large-text resources. In: Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22–24 May 2017, Gothenburg, Sweden, pp. 271–276. Linköping University Electronic Press, Linköpings universitet (2017)
5. Hamilton, W., Clark, K., Leskovec, J., Jurafsky, D.: Inducing domain-specific sentiment lexicons from unlabeled corpora. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 595–605. Association for Computational Linguistics, Austin, November 2016. https://doi.org/10.18653/v1/D16-1057. https://www.aclweb.org/anthology/D16-1057
6. Healy, K.: Data Visualization: A Practical Introduction. Princeton University Press, Princeton (2018)
7. Hotelling, H.: Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24(6), 417 (1933)
8. Jolliffe, I.T., Cadima, J.: Principal component analysis: a review and recent developments. Philos. Trans. Royal Soc. A: Math. Phys. Eng. Sci. 374(2065), 20150202 (2016)
9. Kutuzov, A., Kuzmenko, E.: WebVectors: a toolkit for building web interfaces for vector semantic models. In: Ignatov, D.I., et al. (eds.) AIST 2016. CCIS, vol. 661, pp. 155–161. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-52920-2_15
10. van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(Nov), 2579–2605 (2008)
11. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
12. Miller, G.A.: WordNet: A lexical database for English. Commun. ACM 38(11), 39–41 (1995). https://doi.org/10.1145/219717.219748. http://doi.acm.org/10.1145/219717.219748
13. Navigli, R., Paolo Ponzetto, S.: BabelNet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artif. Intell. 193, 217–250 (2012). https://doi.org/10.1016/j.artint.2012.07.001
14. Pearson, K.: On lines and planes of closest fit to systems of points in space. London Edinburgh Dublin Philos. Mag. J. Sci. 2(11), 559–572 (1901)
15. Řehůřek, R., Sojka, P.: Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, Valletta, Malta, pp. 45–50, May 2010
16. Verlet, L.: Computer experiments on classical fluids. I. Thermodynamical properties of Lennard-Jones molecules. Phys. Rev. 159, 98–103 (1967). https://doi.org/10.1103/PhysRev.159.98. https://link.aps.org/doi/10.1103/PhysRev.159.98
17. Wattenberg, M., Viégas, F., Johnson, I.: How to use t-SNE effectively. Distill 1(10), e2 (2016)
18. Wildgen, W.: From Lullus to cognitive semantics: the evolution of a theory of semantic fields. In: Proceedings of the Twentieth World Congress of Philosophy. University of Bremen (1998)


Author information

Authors and Affiliations

National Research University Higher School of Economics, Moscow, Russia
Nadezda Katricheva, Alyaxey Yaskevich, Anastasiya Lisitsina & Tamara Zhordaniya
University of Oslo, Oslo, Norway
Andrey Kutuzov
University of Trento, Trento, Italy
Elizaveta Kuzmenko


Corresponding author

Correspondence to Nadezda Katricheva.

Editor information

Editors and Affiliations

RWTH Aachen University, Aachen, Germany
Wil M. P. van der Aalst
University of Ljubljana, Ljubljana, Slovenia
Vladimir Batagelj
National Research University Higher School of Economics, Moscow, Russia
Dmitry I. Ignatov
Institute of Mathematics and Mechanics Yekaterinburg, Yekaterinburg, Russia
Michael Khachay
National Research University Higher School of Economics, Moscow, Russia
Valentina Kuskova
University of Oslo, Oslo, Norway
Andrey Kutuzov
National Research University Higher School of Economics, Moscow, Russia
Sergei O. Kuznetsov
National Research University Higher School of Economics, Moscow, Russia
Irina A. Lomazova
Moscow State University, Moscow, Russia
Natalia Loukachevitch
Loria, Vandoeuvre lès Nancy, France
Amedeo Napoli
University of Florida, Gainesville, USA
Panos M. Pardalos
Ca' Foscari University of Venice, Venezia Mestre, Italy
Marcello Pelillo
National Research University Higher School of Economics, Nizhny Novgorod, Russia
Andrey V. Savchenko
Kazan Federal University, Kazan, Russia
Elena Tutubalina


Cite this paper

Katricheva, N., Yaskevich, A., Lisitsina, A., Zhordaniya, T., Kutuzov, A., Kuzmenko, E. (2020). Vec2graph: A Python Library for Visualizing Word Embeddings as Graphs. In: van der Aalst, W., et al. Analysis of Images, Social Networks and Texts. AIST 2019. Communications in Computer and Information Science, vol 1086. Springer, Cham. https://doi.org/10.1007/978-3-030-39575-9_20


DOI: https://doi.org/10.1007/978-3-030-39575-9_20
Published: 02 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-39574-2
Online ISBN: 978-3-030-39575-9
eBook Packages: Computer Science, Computer Science (R0)
