Syntactic Dependency-Based N-grams: More Evidence of Usefulness in Classification

  • Grigori Sidorov
  • Francisco Velasquez
  • Efstathios Stamatatos
  • Alexander Gelbukh
  • Liliana Chanona-Hernández
Conference paper

DOI: 10.1007/978-3-642-37247-6_2

Volume 7816 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Sidorov G., Velasquez F., Stamatatos E., Gelbukh A., Chanona-Hernández L. (2013) Syntactic Dependency-Based N-grams: More Evidence of Usefulness in Classification. In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2013. Lecture Notes in Computer Science, vol 7816. Springer, Berlin, Heidelberg

Abstract

The paper introduces and discusses the concept of syntactic n-grams (sn-grams), which can be used in place of traditional n-grams in many NLP tasks. Sn-grams are constructed by following paths in syntactic trees, which brings syntactic knowledge into machine learning methods. Their construction, however, requires that the text be parsed beforehand. We applied sn-grams to the task of authorship attribution on corpora of three and seven authors, with very promising results.
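To make the idea of following paths in a syntactic tree concrete, here is a minimal sketch in Python. It is an illustration, not the authors' implementation. It assumes the dependency parse is already available as a list of head indices (the example parse is hand-made, not parser output), and it takes an sn-gram of size n to be any downward path of n words running from a head to its dependents. The function names `build_children` and `sn_grams` are hypothetical.

```python
from collections import defaultdict


def build_children(heads):
    """Map each head index to the list of its dependents' indices."""
    children = defaultdict(list)
    for dep, head in enumerate(heads):
        if head is not None:  # None marks the root of the tree
            children[head].append(dep)
    return children


def sn_grams(words, heads, n):
    """Return every downward path of n words (head -> dependent -> ...)."""
    children = build_children(heads)

    def paths_from(node, length):
        # A path of length 1 is just the node itself.
        if length == 1:
            return [[node]]
        # Otherwise extend the path through each dependent of this node.
        return [[node] + rest
                for child in children[node]
                for rest in paths_from(child, length - 1)]

    return [tuple(words[i] for i in path)
            for start in range(len(words))
            for path in paths_from(start, n)]


# Hand-made dependency parse (an assumption for illustration):
# heads[i] is the index of the head of words[i]; the root has head None.
words = ["the", "cat", "chased", "a", "small", "mouse"]
heads = [1, 2, None, 5, 5, 2]

bigrams = sn_grams(words, heads, 2)
trigrams = sn_grams(words, heads, 3)
print(bigrams)
print(trigrams)
```

In this toy sentence, the pair ("chased", "mouse") is an sn-gram of size 2 even though the two words are not adjacent in the surface text, whereas the adjacent pair ("a", "small") is not, because no dependency arc links them. This is the sense in which sn-grams follow syntactic rather than linear neighbourhood.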

Keywords

Syntactic n-grams; sn-grams; syntactic paths; authorship attribution task; SVM classifier

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Grigori Sidorov (1)
  • Francisco Velasquez (1)
  • Efstathios Stamatatos (2)
  • Alexander Gelbukh (1)
  • Liliana Chanona-Hernández (3)
  1. Center for Computing Research (CIC), Instituto Politécnico Nacional (IPN), Mexico City, Mexico
  2. University of the Aegean, Greece
  3. ESIME, Instituto Politécnico Nacional (IPN), Mexico City, Mexico