Predicting User Age by Keystroke Dynamics

  • Avar Pentel
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 764)

Abstract

Keystroke dynamics has been investigated for over 30 years because of its biometric properties, but most studies have focused on identification. In the current study, our goal is to predict user age from keystroke data. We collected keystroke data through different real-life online systems between 2011 and 2018. Data logs were labeled with user age, gender and, in some cases, other available information. We analyzed 2.3 million keystrokes from 7119 keystroke data logs, produced by approximately 1000 individual subjects representing six different age groups. All data logs have been made available to the research community; the web address is provided in the paper. We carried out binary and multiclass classification using supervised machine-learning methods. All binary classification results exceeded the baseline, with the best F-score above 0.92 and the lowest at 0.82. Multiclass classification distinguished all groups at above-baseline levels. In analyzing the distinguishing features, we found overlap with text-mining features reported in previous studies.
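As an illustration of what comparing an F-score against a baseline can look like, the following minimal Python sketch computes the F-score for one class and compares it with a majority-class baseline. It is not the authors' pipeline: the abstract does not specify the baseline used, so the majority-class baseline is an assumption, and the labels, predictions, and function names (`f_score`, `majority_baseline`) are hypothetical toy values chosen for the example.

```python
from collections import Counter


def f_score(y_true, y_pred, positive):
    """F-score (F1) for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_baseline(y_true):
    """Assumed baseline: always predict the most frequent label."""
    label, _ = Counter(y_true).most_common(1)[0]
    return [label] * len(y_true)


# Hypothetical toy data: two age groups, seven keystroke logs.
y_true = ["young", "young", "old", "old", "old", "young", "old"]
y_pred = ["young", "old", "old", "old", "old", "young", "old"]

model_f = f_score(y_true, y_pred, positive="old")
baseline_f = f_score(y_true, majority_baseline(y_true), positive="old")
print(f"model F-score: {model_f:.3f}, baseline F-score: {baseline_f:.3f}")
```

In a multiclass setting the same function can be applied per class, treating each age group in turn as the positive label.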

Keywords

Keystroke dynamics, Age prediction, Multiclass classification, Supervised machine learning

Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  1. School of Digital Technologies, Tallinn University, Tallinn, Estonia
