Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 393)

Abstract

In data-dominated systems and applications, representing words in a numerical format has gained a lot of attention, and several approaches are used to generate such representations. An interesting question is how well these representations, called embeddings, imitate human judgments of semantic similarity between words. In this study, we perform a fuzzy-based analysis of vector representations of words, i.e., word embeddings. We apply two popular fuzzy clustering algorithms to count-based word embeddings, known as GloVe, of different dimensionality. Words from WordSim-353, called the gold standard, are represented as vectors and clustered. The results indicate that fuzzy clustering algorithms are very sensitive to high-dimensional data, and that parameter tuning can dramatically change their performance. We show that, by adjusting the value of the fuzzifier parameter, fuzzy clustering can be successfully applied to vectors of high dimensionality (up to one hundred dimensions). Additionally, we illustrate that fuzzy clustering provides interesting results regarding the membership of words in different clusters.
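To make the role of the fuzzifier concrete, the following is a minimal sketch of fuzzy c-means in NumPy. It rests on several assumptions: the abstract does not name the two algorithms used, so fuzzy c-means is chosen here only as a common representative; the function name `fuzzy_c_means` and its parameters are hypothetical; and synthetic 100-dimensional Gaussian data stands in for GloVe vectors. The fuzzifier `m` appears in the membership update, where values closer to 1 give crisper memberships.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Illustrative fuzzy c-means (hypothetical helper, not the chapter's code).

    X is (n_samples, n_features); m > 1 is the fuzzifier.
    Returns (centers, U), where U[i, k] is the membership of sample i in cluster k.
    """
    rng = np.random.default_rng(seed)
    # Random initial membership matrix whose rows sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers: means of the data weighted by memberships raised to m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances, shape (n_samples, n_clusters).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-in for 100-dimensional embeddings: two Gaussian groups.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(-1.0, 1.0, (100, 100)),
    rng.normal(1.0, 1.0, (100, 100)),
])
for m in (2.0, 1.1):
    _, U = fuzzy_c_means(X, n_clusters=2, m=m)
    print(f"m={m}: mean of max membership = {U.max(axis=1).mean():.3f}")
```

Comparing the printed values for the two settings of `m` gives a rough feel for how strongly the fuzzifier shapes the membership degrees; the numbers depend on the synthetic data and say nothing about the chapter's actual results.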

Notes

  1. 400,000 pre-trained GloVe vectors are available at: https://nlp.stanford.edu/projects/glove/.
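As a rough illustration of how such pre-trained vectors can be read, the sketch below parses the plain-text GloVe layout, in which each line holds a word followed by its space-separated vector components. The helper name `load_glove` and the optional `vocab` filter are hypothetical, and the two sample lines are made-up values rather than real GloVe entries.

```python
import io

import numpy as np


def load_glove(lines, vocab=None):
    """Parse GloVe-style text lines ("word v1 v2 ... vd") into {word: vector}.

    Hypothetical helper: `lines` is any iterable of strings (e.g. an open file);
    `vocab`, if given, restricts the result to those words.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        if vocab is None or word in vocab:
            vectors[word] = np.array([float(x) for x in parts[1:]], dtype=np.float32)
    return vectors


# Made-up 3-dimensional sample in the same layout as a GloVe text file.
sample = io.StringIO("tiger 0.1 0.2 0.3\ncat 0.0 0.5 -0.4\ncar 0.9 0.1 0.0\n")
emb = load_glove(sample, vocab={"tiger", "cat"})
print(sorted(emb))
```

With a downloaded file, the same function could be called on an open file handle (for example `open(path, encoding="utf-8")`, where `path` points to one of the extracted text files), passing the WordSim-353 words as `vocab` to keep memory use small.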

Acknowledgements

The authors express their gratitude to the Ministry of Education of the Republic of Azerbaijan for funding this research under the “State Program on Education of Azerbaijani Youth Abroad in the Years of 2007-2015”.

Author information

Corresponding author

Correspondence to Shahin Atakishiyev.

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Atakishiyev, S., Reformat, M.Z. (2021). Analysis of Word Embeddings Using Fuzzy Clustering. In: Shahbazova, S.N., Kacprzyk, J., Balas, V.E., Kreinovich, V. (eds) Recent Developments and the New Direction in Soft-Computing Foundations and Applications. Studies in Fuzziness and Soft Computing, vol 393. Springer, Cham. https://doi.org/10.1007/978-3-030-47124-8_44
