Detecting Gender by Full Name: Experiments with the Russian Language

Conference paper

DOI: 10.1007/978-3-319-12580-0_17

Part of the Communications in Computer and Information Science book series (CCIS, volume 436)
Cite this paper as:
Panchenko A., Teterin A. (2014) Detecting Gender by Full Name: Experiments with the Russian Language. In: Ignatov D., Khachay M., Panchenko A., Konstantinova N., Yavorsky R. (eds) Analysis of Images, Social Networks and Texts. AIST 2014. Communications in Computer and Information Science, vol 436. Springer, Cham

Abstract

This paper describes a method that detects the gender of a person from his or her full name. While some approaches have been proposed for the English language, little has been done so far for Russian. We fill this gap and present a large-scale experiment on a dataset of 100,000 Russian full names from Facebook. Our method is based on three types of features (word endings, character \(n\)-grams, and a dictionary of names) combined within a linear supervised model. Experiments show that the proposed simple and computationally efficient approach yields excellent results, achieving accuracy of up to 96%.
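
To make the feature design concrete, the following is a minimal sketch of how the three feature types named in the abstract could feed a linear classifier. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which linear model was used, so a plain perceptron stands in for it here; the transliterated names, the tiny `NAME_DICT`, and the function names `extract_features`, `train_perceptron`, and `predict` are all hypothetical and invented for this example.

```python
from collections import defaultdict

# Hypothetical toy dictionary of first names (not the resource used in the paper).
NAME_DICT = {
    "anna": "f", "maria": "f", "olga": "f",
    "ivan": "m", "sergey": "m", "dmitry": "m",
}

def extract_features(full_name, n=2, ending_len=2):
    """Return binary features: word endings, character n-grams, dictionary hits."""
    feats = set()
    for token in full_name.lower().split():
        # Feature type 1: word ending (last `ending_len` characters).
        feats.add("end=" + token[-ending_len:])
        # Feature type 2: character n-grams with boundary markers.
        padded = "^" + token + "$"
        for i in range(len(padded) - n + 1):
            feats.add("ng=" + padded[i:i + n])
        # Feature type 3: lookup in the dictionary of names.
        if token in NAME_DICT:
            feats.add("dict=" + NAME_DICT[token])
    return feats

def train_perceptron(data, epochs=10):
    """Train a perceptron (an assumed stand-in for the paper's linear model).

    Labels are +1 for female and -1 for male.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for name, label in data:
            feats = extract_features(name)
            score = sum(weights[f] for f in feats)
            pred = 1 if score > 0 else -1
            if pred != label:
                # Standard perceptron update on a mistake.
                for f in feats:
                    weights[f] += label
    return weights

def predict(weights, full_name):
    """Classify a full name with the learned linear weights."""
    score = sum(weights.get(f, 0.0) for f in extract_features(full_name))
    return "female" if score > 0 else "male"

# Invented toy training set; the paper uses 100,000 names from Facebook.
TRAIN = [
    ("Anna Ivanova", 1),
    ("Maria Petrova", 1),
    ("Olga Smirnova", 1),
    ("Ivan Ivanov", -1),
    ("Sergey Petrov", -1),
    ("Dmitry Smirnov", -1),
]

weights = train_perceptron(TRAIN)
print(predict(weights, "Elena Kuznetsova"))
print(predict(weights, "Alexey Kuznetsov"))
```

Even on this toy data, the endings and boundary-marked \(n\)-grams carry most of the signal for names that are absent from the dictionary, which is consistent with the abstract's emphasis on a simple, computationally cheap feature set.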

Keywords

Gender detection, Short text classification

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Digital Society Laboratory LLC, Moscow, Russia
  2. Université catholique de Louvain, Louvain-la-Neuve, Belgium
