Comparing Sentence-Level Features for Authorship Analysis in Portuguese
In this paper we compare the robustness of several types of stylistic markers for discriminating authorship at the sentence level. We train an SVM-based classifier using each set of features separately and perform sentence-level authorship analysis over a corpus of editorials published in a Portuguese quality newspaper. Results show that features based on POS information, punctuation, and word and sentence length contribute to a more robust sentence-level authorship analysis.
Keywords: Support Vector Machine, Discriminant Function Analysis, Sentence Length, Test Sentence
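To make the surface-level feature families named in the abstract concrete, the following is a minimal sketch of how punctuation and word/sentence length features might be computed for a single sentence. It is an illustration under stated assumptions, not the paper's implementation: the function name `sentence_features`, the feature names, and the punctuation inventory `PUNCTUATION_MARKS` are hypothetical, and POS-based features (which require a tagger) and the SVM training step are not shown.

```python
import string
from typing import Dict

# Hypothetical punctuation inventory; the paper's actual set is not given here.
PUNCTUATION_MARKS = ",.;:!?()\"'-"


def sentence_features(sentence: str) -> Dict[str, float]:
    """Compute simple length and punctuation features for one sentence."""
    # Whitespace tokenization, then strip ASCII punctuation from token edges.
    tokens = sentence.split()
    words = [t.strip(string.punctuation) for t in tokens]
    words = [w for w in words if w]

    features = {
        "sentence_length_words": float(len(words)),
        "sentence_length_chars": float(len(sentence)),
        # Mean word length in characters; 0.0 for an empty sentence.
        "mean_word_length": (
            sum(len(w) for w in words) / len(words) if words else 0.0
        ),
    }
    # One count feature per punctuation mark.
    for mark in PUNCTUATION_MARKS:
        features["punct_" + mark] = float(sentence.count(mark))
    return features


if __name__ == "__main__":
    # Illustrative Portuguese sentence (not taken from the paper's corpus).
    for name, value in sentence_features("O governo, afinal, recuou.").items():
        print(name, value)
```

Each sentence is mapped to a fixed-length numeric vector, which is the kind of input a per-feature-set classifier such as an SVM would consume.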