DLP-personality detection: a text-based personality detection framework with psycholinguistic features and pre-trained features

Multimedia Tools and Applications

Abstract

Text-based personality detection aims to identify the personality traits implied in subject-supplied textual data. However, over-reliance on pre-trained language models and neglect of psycholinguistic features have become a bottleneck in personality detection. In this work, we conduct extensive feature-level ablation experiments with multiple psycholinguistic features to verify their importance for personality detection. Furthermore, we propose a novel personality detection framework, DLP-Personality Detection, which fuses multiple psycholinguistic features with pre-trained language features. With DLP-Personality Detection, we achieve state-of-the-art performance for the Big Five (Big 5) and Myers-Briggs Type Indicator (MBTI) personality traits on the Essays and Kaggle MBTI datasets. Finally, we offer suggestions on the use of psycholinguistic features and discuss future work for personality detection.
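
The abstract does not say how the features are fused or how the ablations are run, so the following is only a minimal sketch under stated assumptions: fusion is taken to be plain concatenation of per-group feature vectors, and a feature-level ablation is taken to mean rebuilding the fused vector with one group left out. The group names, the toy values, and the helpers `fuse_features` and `leave_one_out_ablations` are hypothetical and do not come from the paper.

```python
from typing import Dict, FrozenSet, List


def fuse_features(
    groups: Dict[str, List[float]],
    exclude: FrozenSet[str] = frozenset(),
) -> List[float]:
    """Concatenate feature groups in sorted-name order, skipping excluded groups.

    Assumption: fusion is simple concatenation; the paper may use another scheme.
    """
    fused: List[float] = []
    for name in sorted(groups):
        if name not in exclude:
            fused.extend(groups[name])
    return fused


def leave_one_out_ablations(
    groups: Dict[str, List[float]],
) -> Dict[str, List[float]]:
    """Map each group name to the fused vector built without that group."""
    return {name: fuse_features(groups, frozenset({name})) for name in groups}


# Hypothetical toy vectors for one document. In practice the psycholinguistic
# groups would come from lexicon-based extractors and the embedding from a
# pre-trained language model.
features = {
    "psycholinguistic_a": [0.2, 0.7],
    "psycholinguistic_b": [0.5],
    "pretrained_embedding": [0.1, -0.3, 0.9],
}

full = fuse_features(features)
ablations = leave_one_out_ablations(features)
```

Each entry of `ablations` would be fed to the same classifier as `full`; comparing the resulting scores indicates how much the omitted group contributes.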

Data Availability

The data that support the findings of this study are available from https://github.com/ml-papers-coders/Keras-BigFive-personality-traits/blog/ and https://www.kaggle.com/datasets/datanaek/mbit-type.

Notes

  1. pypi.org/project/readability/

Author information

Corresponding author

Correspondence to Hao Lin.

Ethics declarations

Ethics Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of Interest

All authors declare that they do not have any conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, H. DLP-personality detection: a text-based personality detection framework with psycholinguistic features and pre-trained features. Multimed Tools Appl 83, 37275–37294 (2024). https://doi.org/10.1007/s11042-023-17015-z

  • DOI: https://doi.org/10.1007/s11042-023-17015-z
