Language Processing and Intelligent Information Systems

Volume 7912 of the series Lecture Notes in Computer Science pp 112-124

Unsupervised Induction of Persian Semantic Verb Classes Based on Syntactic Information

  • Maryam Aminian, Department of Computer Engineering, Sharif University of Technology
  • Mohammad Sadegh Rasooli, Department of Computer Science, Columbia University
  • Hossein Sameti, Department of Computer Engineering, Sharif University of Technology


Automatic induction of semantic verb classes is one of the most challenging tasks in computational lexical semantics, with a wide variety of applications in natural language processing. The large number of Persian speakers and the lack of such semantic classes for Persian verbs motivated us to apply unsupervised algorithms to Persian verb clustering. In this paper, we report experiments on inducing the semantic classes of Persian verbs based on Levin’s theory of verb classes. Syntactic information extracted from dependency trees is used as the base features for clustering the verbs. Since no manual classification of Persian verbs existed prior to this work, we prepared a manual classification of 265 verbs into 43 semantic classes to serve as the gold standard. We show that the spectral clustering algorithm outperforms k-means and improves on the baseline algorithm by about 17% in F-measure and 0.13 in Rand index.
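
The two evaluation measures named in the abstract compare an induced clustering against a gold-standard classification. The following is a minimal sketch of how such a comparison can be computed over pairs of verbs. It is not the authors' evaluation code: the function names, the toy labels, and the choice of a pairwise F-measure (one common variant among several) are assumptions made for illustration.

```python
from itertools import combinations


def pair_counts(gold, pred):
    """Classify every pair of items by agreement of gold and predicted labels."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(gold)), 2):
        same_gold = gold[i] == gold[j]
        same_pred = pred[i] == pred[j]
        if same_gold and same_pred:
            tp += 1  # together in both gold and prediction
        elif same_pred:
            fp += 1  # clustered together, but in different gold classes
        elif same_gold:
            fn += 1  # same gold class, but split across clusters
        else:
            tn += 1  # apart in both
    return tp, fp, fn, tn


def rand_index(gold, pred):
    """Fraction of item pairs on which gold and prediction agree."""
    tp, fp, fn, tn = pair_counts(gold, pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 1.0


def pairwise_f_measure(gold, pred):
    """Harmonic mean of pairwise precision and recall (assumed variant)."""
    tp, fp, fn, _ = pair_counts(gold, pred)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: six verbs, gold class names and induced cluster ids.
gold = ["motion", "motion", "motion", "speech", "speech", "change"]
pred = [0, 0, 1, 1, 1, 2]

print(rand_index(gold, pred), pairwise_f_measure(gold, pred))
```

Both measures depend only on which items are grouped together, so the cluster ids produced by an algorithm need not match the gold class names.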