An Analysis of Word2Vec for the Italian Language

Progresses in Artificial Intelligence and Neural Systems

Abstract

Word representation is fundamental to NLP tasks: encoding the semantic closeness between words is what makes it possible to teach a machine to understand text. Although word embeddings are now widely used, few results exist for languages other than English. This work analyses the semantic capacity of the Word2Vec algorithm and produces an embedding for the Italian language. It explores parameter settings such as the number of epochs, the size of the context window, and the number of negative samples.
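
As a rough illustration of where the three parameters named in the abstract enter training, the following Python sketch generates skip-gram training examples with negative sampling. It is not the chapter's code: the function name `sgns_examples`, the toy Italian sentence, and the uniform choice of negatives are assumptions made for brevity (the original Word2Vec draws negatives from a smoothed unigram distribution and also varies the effective window size at random).

```python
import random


def sgns_examples(sentences, window=5, negative=5, epochs=1, seed=0):
    """Yield (center, context, negatives) triples for skip-gram with
    negative sampling.

    Hypothetical helper for illustration only:
    - window:   maximum distance between the center and a context word
    - negative: number of negative words drawn per positive pair
    - epochs:   number of full passes over the corpus
    """
    rng = random.Random(seed)
    vocab = sorted({word for sent in sentences for word in sent})
    for _ in range(epochs):
        for sent in sentences:
            for i, center in enumerate(sent):
                start = max(0, i - window)
                stop = min(len(sent), i + window + 1)
                for j in range(start, stop):
                    if j == i:
                        continue  # a word is not its own context
                    context = sent[j]
                    # Assumption: negatives are drawn uniformly from the
                    # vocabulary, excluding the true context word.
                    candidates = [w for w in vocab if w != context]
                    negatives = [rng.choice(candidates) for _ in range(negative)]
                    yield center, context, negatives


# Toy corpus (assumed): one tokenised Italian sentence.
corpus = [["il", "gatto", "dorme", "sul", "divano"]]
for center, context, negatives in sgns_examples(corpus, window=1, negative=2):
    print(center, context, negatives)
```

A larger window yields more positive pairs per word, more negatives make each positive pair costlier to process, and more epochs repeat the whole pass, so all three settings trade training time against the quality of the resulting embedding.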

Author information

Corresponding author

Correspondence to Giovanni Di Gennaro.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Di Gennaro, G., Buonanno, A., Di Girolamo, A., Ospedale, A., Palmieri, F.A.N., Fedele, G. (2021). An Analysis of Word2Vec for the Italian Language. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Progresses in Artificial Intelligence and Neural Systems. Smart Innovation, Systems and Technologies, vol 184. Springer, Singapore. https://doi.org/10.1007/978-981-15-5093-5_13
