Random Indexing Distributional Semantic Models for Croatian Language

  • Vedrana Janković
  • Jan Šnajder
  • Bojana Dalbelo Bašić
Conference paper

DOI: 10.1007/978-3-642-23538-2_52

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6836)
Cite this paper as:
Janković V., Šnajder J., Dalbelo Bašić B. (2011) Random Indexing Distributional Semantic Models for Croatian Language. In: Habernal I., Matoušek V. (eds) Text, Speech and Dialogue. TSD 2011. Lecture Notes in Computer Science, vol 6836. Springer, Berlin, Heidelberg

Abstract

Distributional semantic models (DSMs) model semantic relations between expressions by comparing the contexts in which these expressions occur. This paper presents an extensive evaluation of distributional semantic models for the Croatian language. We focus on random indexing models, an efficient and scalable approach to building DSMs. We build a number of models with different parameters (dimension, context type, and similarity measure) and compare them against human semantic similarity judgments. Our results indicate that even low-dimensional random indexing models may outperform raw frequency models, and that, of these parameters, the choice of similarity measure matters most.
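
As a rough illustration of the general random indexing idea mentioned in the abstract, the following minimal sketch gives each context word a sparse random index vector and sums the index vectors of its neighbours into each target word's context vector. It is not the authors' implementation: the function names, the window-based context, the cosine measure, and the default values (`dim`, `nonzero`, `window`, `seed`) are all assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


def make_index_vector(dim, nonzero, rng):
    """Sparse ternary index vector: `nonzero` random positions set to +1 or -1."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec


def build_random_indexing_model(sentences, dim=300, nonzero=6, window=2, seed=0):
    """Return {word: context vector} built from tokenized sentences.

    All parameter values here are illustrative assumptions,
    not settings reported in the paper.
    """
    rng = random.Random(seed)
    index_vectors = {}                              # one fixed random vector per context word
    context_vectors = defaultdict(lambda: [0] * dim)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue                        # a word is not its own context
                neighbour = tokens[j]
                if neighbour not in index_vectors:
                    index_vectors[neighbour] = make_index_vector(dim, nonzero, rng)
                ctx = context_vectors[target]
                # Accumulate the neighbour's index vector into the target's context vector.
                for k, value in enumerate(index_vectors[neighbour]):
                    ctx[k] += value
    return dict(context_vectors)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Words that occur in the same contexts end up with similar context vectors, so `cosine(model[w1], model[w2])` can serve as a similarity score to compare against human judgments. Because the dimension is fixed in advance, the vectors do not grow with vocabulary size, which is what makes the approach scalable.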

Keywords

Distributional semantic model, computational semantics, random indexing, Croatian language

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Vedrana Janković
    • 1
  • Jan Šnajder
    • 1
  • Bojana Dalbelo Bašić
    • 1
  1. Faculty of Electrical Engineering and Computing, University of Zagreb, Croatia
