Unsupervised Induction of Persian Semantic Verb Classes Based on Syntactic Information

  • Maryam Aminian
  • Mohammad Sadegh Rasooli
  • Hossein Sameti
Conference paper

DOI: 10.1007/978-3-642-38634-3_13

Volume 7912 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Aminian M., Rasooli M.S., Sameti H. (2013) Unsupervised Induction of Persian Semantic Verb Classes Based on Syntactic Information. In: Kłopotek M.A., Koronacki J., Marciniak M., Mykowiecka A., Wierzchoń S.T. (eds) Language Processing and Intelligent Information Systems. Lecture Notes in Computer Science, vol 7912. Springer, Berlin, Heidelberg

Abstract

Automatic induction of semantic verb classes is one of the most challenging tasks in computational lexical semantics, with a wide variety of applications in natural language processing. The large number of Persian speakers and the lack of such semantic classes for Persian verbs have motivated us to use unsupervised algorithms for Persian verb clustering. In this paper, we report experiments on inducing the semantic classes of Persian verbs based on Levin’s theory of verb classes. Syntactic information extracted from dependency trees is used as the base features for clustering the verbs. Because no manual classification of Persian verbs existed prior to this work, we prepared a manual classification of 265 verbs into 43 semantic classes. We show that the spectral clustering algorithm outperforms k-means and improves on the baseline algorithm by about 17% in F-measure and 0.13 in Rand index.
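
The Rand index mentioned in the abstract compares an induced clustering against a gold classification by counting the item pairs on which the two agree (both place the pair together, or both place it apart). The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `rand_index`, the verb class names, and the toy labels are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def rand_index(gold, predicted):
    """Return the fraction of item pairs on which two clusterings agree."""
    if len(gold) != len(predicted):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(gold)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = 0
    for i, j in pairs:
        same_gold = gold[i] == gold[j]            # pair shares a gold class?
        same_pred = predicted[i] == predicted[j]  # pair shares a cluster?
        if same_gold == same_pred:
            agree += 1
    return agree / len(pairs)


# Hypothetical toy data: six verbs, gold classes vs. induced cluster ids.
gold = ["motion", "motion", "motion", "change", "change", "change"]
predicted = [0, 0, 1, 1, 2, 2]
score = rand_index(gold, predicted)
```

The score ranges from 0 to 1, and identical partitions score 1 regardless of how the cluster labels are named, so a reported gain of 0.13 is an absolute difference on that scale.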

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Maryam Aminian (1)
  • Mohammad Sadegh Rasooli (2)
  • Hossein Sameti (1)
  1. Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
  2. Department of Computer Science, Columbia University, New York, USA