Estimating the Similarities Between Texts of Right-Handed and Left-Handed Males and Females
Identifying the characteristics of text authors is important for marketing, security, and other applications, and interest in the problem has grown in recent years. Gender is one of the author characteristics most often studied by means of text analysis. Although many studies have advanced the field, identifying the gender of a text's author remains challenging. One reason is that current research does not consider how individual characteristics, such as gender and laterality, influence one another. In this paper, we use RusNeuroPsych, a specially designed corpus of Russian texts that includes neuropsychological data on their authors, to calculate the distances between texts written by right-handed and left-handed males and females (four classes). We chose handedness because it is one of the most important measures of laterality. The distances between the classes were calculated with the Wave-Hedges distance. The text parameters were topic-independent and high-frequency (indices of lexical diversity, various part-of-speech ratios, etc.). The results show that texts by authors of different genders but the same handedness are linguistically more similar than texts by authors of the same gender but different handedness. We therefore suggest that it may be useful to build a classifier for the combined “gender + handedness” classes instead of predicting gender alone.
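As a rough illustration of the distance computation mentioned in the abstract, the sketch below uses the common textbook form of the Wave-Hedges distance, the sum over features of |x_i - y_i| / max(x_i, y_i). It assumes non-negative feature values and treats a term with x_i = y_i = 0 as contributing zero. The abstract does not state the exact formula, the feature set, or how texts were aggregated per class, so the class names, the feature vectors, and the helper `pairwise_distances` are hypothetical and are not taken from the RusNeuroPsych corpus.

```python
from itertools import combinations


def wave_hedges(x, y):
    """Wave-Hedges distance: sum of |x_i - y_i| / max(x_i, y_i) over all features.

    Assumes non-negative feature values (e.g. ratios or diversity indices).
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    total = 0.0
    for a, b in zip(x, y):
        if a < 0 or b < 0:
            raise ValueError("feature values must be non-negative")
        largest = max(a, b)
        if largest > 0:  # a term with a == b == 0 contributes nothing
            total += abs(a - b) / largest
    return total


# Hypothetical per-class mean feature vectors (illustrative values only,
# NOT results from the paper): e.g. a lexical diversity index followed by
# two part-of-speech ratios.
class_means = {
    "male_right": [0.50, 0.20, 0.40],
    "male_left": [0.60, 0.25, 0.30],
    "female_right": [0.55, 0.20, 0.35],
    "female_left": [0.60, 0.30, 0.30],
}


def pairwise_distances(means):
    """Return the Wave-Hedges distance for every unordered pair of classes."""
    return {
        (a, b): wave_hedges(means[a], means[b])
        for a, b in combinations(means, 2)
    }


if __name__ == "__main__":
    for pair, dist in sorted(pairwise_distances(class_means).items()):
        print(pair, round(dist, 3))
```

Four classes give six unordered pairs, so comparing same-gender pairs against same-handedness pairs amounts to reading off the corresponding entries of the returned dictionary.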
Keywords: Gender attribution, Authorship profiling, Handedness, Laterality, Distance measures, Russian language
This research is financially supported by the Russian Science Foundation, project No. 16-18-10050, “Identifying the Gender and Age of Online Chatters Using Formal Parameters of their Texts”.