Abstract
Text-based personality detection aims to identify the personality traits implied in text supplied by a subject. However, over-reliance on pre-trained language models and neglect of psycholinguistic features have become a bottleneck for the task. In this work, we conduct extensive feature-level ablation experiments with multiple psycholinguistic features to verify their importance for personality detection. Furthermore, we propose a novel personality detection framework, DLP-Personality Detection, which fuses multiple psycholinguistic features with features from pre-trained language models. With DLP-Personality Detection, we achieve state-of-the-art performance on the Big Five (Big 5) and Myers-Briggs Type Indicator (MBTI) personality traits on the Essays and Kaggle MBTI datasets. Finally, we offer suggestions regarding psycholinguistic features and discuss future work for personality detection.
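To make the idea of feature fusion and feature-level ablation concrete, the following is a minimal sketch and not the DLP-Personality Detection implementation. The abstract does not state how features are fused, so plain concatenation is assumed here. The group names (`plm_embedding`, `emotion_lexicon`, `readability`), their values, and the helper functions `fuse_features` and `ablation_subsets` are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def fuse_features(groups: Dict[str, Sequence[float]], keep: Sequence[str]) -> List[float]:
    """Concatenate the selected feature groups into one flat vector.

    Assumption: fusion is plain concatenation. Groups are joined in the
    order given by `keep`, so the layout of the fused vector is deterministic.
    """
    fused: List[float] = []
    for name in keep:
        fused.extend(groups[name])
    return fused


def ablation_subsets(names: Sequence[str]) -> List[Tuple[str, ...]]:
    """List the leave-one-out subsets for a feature-level ablation."""
    return list(combinations(names, len(names) - 1))


# Hypothetical per-document features; names and values are illustrative only.
example_groups = {
    "plm_embedding": [0.12, -0.40, 0.88],  # stand-in for a pre-trained LM vector
    "emotion_lexicon": [0.0, 1.0],         # stand-in for lexicon-based counts
    "readability": [7.5],                  # stand-in for a readability score
}

# Full model input: every feature group concatenated.
full_vector = fuse_features(example_groups, list(example_groups))

# Ablated inputs: one feature group removed at a time.
ablated = {
    subset: fuse_features(example_groups, subset)
    for subset in ablation_subsets(list(example_groups))
}
```

In an ablation study of this kind, a classifier would be trained once on `full_vector`-style inputs and once per entry of `ablated`. The performance drop for each removed group is then read as an indication of that group's contribution.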
Data Availability
The data that support the findings of this study are available from https://github.com/ml-papers-coders/Keras-BigFive-personality-traits/blog/ and https://www.kaggle.com/datasets/datanaek/mbit-type.
Notes
pypi.org/project/readability/
Ethics declarations
Ethics Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Conflict of Interest
All authors declare that they do not have any conflict of interest.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lin, H. DLP-personality detection: a text-based personality detection framework with psycholinguistic features and pre-trained features. Multimed Tools Appl 83, 37275–37294 (2024). https://doi.org/10.1007/s11042-023-17015-z
DOI: https://doi.org/10.1007/s11042-023-17015-z