Abstract
Personality is a psychological concept that embodies the unique characteristics of an individual. The Lexical Hypothesis states that language use, and the terms people use to describe one another, can help us determine personality traits. Technological breakthroughs have brought about huge improvements in data collection and processing. These could help to develop autonomous personality assessment models by deriving linguistic markers from the data present in social media, telecommunication signals, and even signals collected from human–machine interaction. Numerous studies have centred on using machine learning to automate personality recognition from text. However, questions remain about their performance, reliability, and ethical usage. To address these questions, we extensively review and analyse the existing research in the field of personality computing using linguistic markers in text. We provide a content-oriented classification of the techniques used. We also examine the existing literature for gaps and limitations through a detailed comparative analysis. The field of personality computing has the potential to affect every area of human life, but progress so far is limited. Our review will help researchers build on what has been achieved so far and make faster progress in the field.
Data availability
Essays: James W. Pennebaker, Laura A. King. ([Year & Month of dataset creation]). Essays, Version 1. Retrieved 18-08-2022 from essays.csv | Kaggle (https://www.kaggle.com/datasets/manjarinandimajumdar/essayscsv).
MyPersonality: The dataset was collected from Facebook by David Stillwell and Michal Kosinski for the myPersonality project. David Stillwell, Michal Kosinski. (2012; [Month of dataset creation]). MyPersonality, [Version of the dataset]. Retrieved 18-08-2022 from wiki:mypersonality_final (https://web.archive.org/web/20180428085315/http://mypersonality.org/wiki/lib/exe/fetch.php?media=wiki:mypersonality_final.zip). Citation for the myPersonality project: Kosinski, M., Matz, S., Gosling, S., Popov, V., & Stillwell, D. (2015). Facebook as a social science research tool: Opportunities, challenges, ethical considerations and practical guidelines. American Psychologist.
PAN-AP-15: Francisco Rangel, Fabio Celli, Paolo Rosso, Martin Potthast, Benno Stein, and Walter Daelemans. (2015; September). PAN15-Author-Profiling, Version 1. Retrieved [Date Retrieved] from PAN Data (webis.de) (https://pan.webis.de/data.html#pan15-author-profiling).
YouTube: Joan-Isaac Biel, Daniel Gatica-Perez. (2012). YouTube Personality, Version 1. The dataset was originally released by the Idiap institute and is no longer available for download on the updated website [Youtube Personality—English (idiap.ch)], retrieved 18-08-2022. However, we were able to download it from OpenML, retrieved 18-08-2022 (https://www.openml.org/search?type=data&sort=runs&id=41411&status=active).
References
Theophrastus. (4th Century BC). The characters.
Papurt, M. J. (1930). A study of the Woodworth psychoneurotic inventory with suggested revision. The Journal of Abnormal and Social Psychology, 25(3), 335.
Cattell, H. E., & Mead, A. D. (2008). The sixteen personality factor questionnaire (16PF).
Costa Jr, P. T., & McCrae, R. R. (2008). The revised NEO personality inventory (NEO-PI-R). Sage.
Briggs, K. C. (1976). Myers–Briggs type indicator. Consulting Psychologists Press.
Vinciarelli, A., & Mohammadi, G. (2014). A survey of personality computing. IEEE Transactions on Affective Computing, 5(3), 273–291.
Pennebaker, J. W., & King, L. A. (1999). Linguistic styles: Language use as an individual difference. Journal of Personality and Social Psychology, 77(6), 1296–1312. https://doi.org/10.1037/0022-3514.77.6.1296
Celli, F., Pianesi, F., Stillwell, D., & Kosinski, M. (2013, June). Workshop on computational personality recognition: Shared task. In Proceedings of the international AAAI conference on web and social media (Vol. 7, No. 1).
Rangel Pardo, F. M., Celli, F., Rosso, P., Potthast, M., Stein, B., & Daelemans, W. (2015). Overview of the 3rd Author Profiling Task at PAN 2015. In CLEF 2015 evaluation labs and workshop working notes papers (pp. 1–8).
Biel, J. I., & Gatica-Perez, D. (2012). The YouTube lens: Crowdsourced personality impressions and audiovisual analysis of vlogs. IEEE Transactions on Multimedia, 15(1), 41–55.
Hurst, M. F. (2006). Temporal text mining. In AAAI spring symposium: Computational approaches to analyzing weblogs (pp. 73–77).
Cutting, D., Kupiec, J., Pedersen, J., & Sibun, P. (1992, March). A practical part-of-speech tagger. In 3rd conference on applied natural language processing (pp. 133–140).
Zhang, Y., Jin, R., & Zhou, Z. H. (2010). Understanding bag-of-words model: A statistical framework. International Journal of Machine Learning and Cybernetics, 1(1), 43–52.
Grishman, R., & Sundheim, B. M. (1996). Message understanding conference-6: A brief history. In COLING 1996 volume 1: The 16th international conference on computational linguistics.
Chung, C., & Pennebaker, J. W. (2007). The psychological functions of function words. Social Communication, 1, 343–359.
Brown, P. F., Della Pietra, V. J., Desouza, P. V., Lai, J. C., & Mercer, R. L. (1992). Class-based n-gram models of natural language. Computational Linguistics, 18(4), 467–480.
Chen, K., Zhang, Z., Long, J., & Zhang, H. (2016). Turning from TF-IDF to TF-IGM for term weighting in text classification. Expert Systems with Applications, 66, 245–260.
Le, Q., & Mikolov, T. (2014, June). Distributed representations of sentences and documents. In International conference on machine learning (pp. 1188–1196). PMLR.
Davison, A. (1984). Readability—Appraising text difficulty. Learning to read in American schools: Basal readers and content texts (pp. 121–139).
Kelledy, F., & Smeaton, A. F. (1997, April). Automatic phrase recognition and extraction from text. In Proceedings of the 19th annual BCS-IRSG colloquium on IR research 19 (pp. 1–9).
Wallach, H. M. (2006, June). Topic modelling: Beyond bag-of-words. In Proceedings of the 23rd international conference on machine learning (pp. 977–984).
Lapponi, E., Read, J., & Øvrelid, L. (2012, December). Representing and resolving negation for sentiment analysis. In 2012 IEEE 12th international conference on data mining workshops (pp. 687–692). IEEE.
Howard, J., & Ruder, S. (2018). Universal language model fine-tuning for text classification. Preprint arXiv:1801.06146.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, 26, 1.
Niu, L., Xinyu, D., Jianbing, Z., & Jiajun, C. (2015). Topic2Vec: Learning distributed representations of topics. In 2015 international conference on Asian language processing (IALP) (pp. 193–196). IEEE.
Pennington, J., Socher, R., & Manning, C. D. (2014, October). GloVe: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532–1543).
Young, J. C., & Rusli, A. (2019, August). Review and visualization of Facebook's FastText pretrained word vector model. In 2019 international conference on engineering, science, and industrial applications (ICESI) (pp. 1–6). IEEE.
Yao, D., Bi, J., Huang, J., & Zhu, J. (2015, July). A word distributed representation based framework for large-scale short text classification. In 2015 international joint conference on neural networks (IJCNN) (pp. 1–7). IEEE.
Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. In Proc. of NAACL.
Dey, R., & Salem, F. M. (2017, August). Gate-variants of gated recurrent unit (GRU) neural networks. In 2017 IEEE 60th international midwest symposium on circuits and systems (MWSCAS) (pp. 1597–1600). IEEE.
Merity, S., Keskar, N. S., & Socher, R. (2017). Regularizing and optimizing LSTM language models. Preprint arXiv:1708.02182.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. Preprint arXiv:1810.04805.
Zhou, J., Zhang, Z., Zhao, H., & Zhang, S. (2019). LIMIT-BERT: Linguistic informed multi-task BERT. Preprint arXiv:1910.14296.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., & Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. Preprint arXiv:1907.11692.
Mairesse, F., Walker, M. A., Mehl, M. R., & Moore, R. K. (2007). Using linguistic cues for the automatic recognition of personality in conversation and text. Journal of Artificial Intelligence Research, 30, 457–500.
Coltheart, M. (1981). The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology Section A, 33(4), 497–505.
Whissell, C., Fournier, M., Pelland, R., Weir, D., & Makarec, K. (1986). A dictionary of affect in language: IV. Reliability, validity, and applications. Perceptual and Motor Skills, 62(3), 875–888.
Moffitt, K., Giboney, J., Ehrhardt, E., Burgoon, J. K., & Nunamaker, J. F. (2010). Structured programming for linguistic cue extraction. The Center for the Management of Information, 1, 1.
Stone, P. J., Dunphy, D. C., & Smith, M. S. (1966). The general inquirer: A computer approach to content analysis.
Cambria, E., & Hussain, A. (2015). SenticNet. In Sentic computing (pp. 23–71). Springer, Cham.
Nielsen, F. Å. (2011). A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. Preprint arXiv:1103.2903.
Mohammad, S. M., & Turney, P. D. (2013). NRC emotion lexicon. National Research Council, Canada, 2.
Mohammad, S. (2018, July). Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words. In Proceedings of the 56th annual meeting of the association for computational linguistics (Volume 1: Long papers) (pp. 174–184).
Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the ACM, 38(11), 39–41.
Havasi, C., Speer, R., & Alonso, J. (2007, September). ConceptNet 3: A flexible, multilingual semantic network for common sense knowledge. In Recent advances in natural language processing (pp. 27–29). John Benjamins.
Poria, S., Gelbukh, A., Hussain, A., Howard, N., Das, D., & Bandyopadhyay, S. (2013). Enhanced SenticNet with affective labels for concept-based opinion mining. IEEE Intelligent Systems, 28(2), 31–38.
Searle, J. R. (1975). Indirect speech acts. In Speech acts (pp. 59–82). Brill.
Searle, J. R. (1976). A classification of illocutionary acts. Language in Society, 5(1), 1–23.
Walker, M., & Whittaker, S. (1995). Mixed initiative in dialogue: An investigation into discourse segmentation. Preprint arXiv:cmp-lg/9504007.
McCrae, R. R., & John, O. P. (1992). An introduction to the five-factor model and its applications. Journal of Personality, 60(2), 175–215.
Allport, G. W., & Odbert, H. S. (1936). Trait-names: A psycho-lexical study. Psychological Monographs, 47(1), i.
Schwartz, S. H. (2007). Basic human values: Theory, measurement, and applications. Revue Française de Sociologie, 47(4), 929.
Eysenck, H. J. (1982). Personality, genetics, and behavior: Selected papers.
Newman, J. (1981). Myers, Isabel Briggs. The Myers-Briggs type indicator. Palo Alto, CA, Consulting Psychologists Press, 1976. Myers, Isabel Briggs (with Peter B. Myers). Gifts Differing. Palo Alto, CA, Consulting Psychologists Press, 1980.
Paulhus, D. L., & Williams, K. M. (2002). The dark triad of personality: Narcissism, Machiavellianism, and psychopathy. Journal of Research in Personality, 36(6), 556–563.
Goldberg, L. R. (1992). The development of markers for the Big-Five factor structure. Psychological Assessment, 4(1), 26.
Ashton, M. C., & Lee, K. (2007). Empirical, theoretical, and practical advantages of the HEXACO model of personality structure. Personality and Social Psychology Review, 11(2), 150–166.
Gill, A. J., & Oberlander, J. (2002). Taking care of the linguistic features of extraversion. In Proceedings of the annual meeting of the cognitive science society (Vol. 24, No. 24).
Gosling, S. D., Ko, S. J., Mannarelli, T., & Morris, M. E. (2002). A room with a cue: Personality judgments based on offices and bedrooms. Journal of Personality and Social Psychology, 82(3), 379.
Vazire, S., & Gosling, S. D. (2004). e-Perceptions: Personality impressions based on personal websites. Journal of Personality and Social Psychology, 87(1), 123.
Mehl, M. R., Gosling, S. D., & Pennebaker, J. W. (2006). Personality in its natural habitat: Manifestations and implicit folk theories of personality in daily life. Journal of Personality and Social Psychology, 90(5), 862.
Gosling, S. D., Gaddis, S., & Vazire, S. (2007). Personality impressions based on Facebook profiles. ICWSM, 7, 1–4.
Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24–54.
Yarkoni, T. (2010). Personality in 100,000 words: A large-scale analysis of personality and word use among bloggers. Journal of Research in Personality, 44(3), 363–373.
Holtgraves, T. (2011). Text messaging, personality, and the social context. Journal of Research in Personality, 45(1), 92–99.
Iacobelli, F., Gill, A. J., Nowson, S., & Oberlander, J. (2011, October). Large scale personality classification of bloggers. In International conference on affective computing and intelligent interaction (pp. 568–577). Springer.
Qiu, L., Lin, H., Ramsay, J., & Yang, F. (2012). You are what you tweet: Personality expression and perception on Twitter. Journal of Research in Personality, 46, 710–718. https://doi.org/10.1016/j.jrp.2012.08.008
Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., Dziurzynski, L., Ramones, S. M., Agrawal, M., Shah, A., Kosinski, M., Stillwell, D., Seligman, M. E., & Ungar, L. H. (2013). Personality, gender, and age in the language of social media: The open-vocabulary approach. PLoS ONE, 8(9), e73791.
Park, G., Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., Kosinski, M., Stillwell, D. J., Ungar, L. H., & Seligman, M. E. (2015). Automatic personality assessment through social media language. Journal of Personality and Social Psychology, 108(6), 934.
Hassanein, M. M., Rady, S., Hussein, W., & Gharib, T. (2021). Extracting relationships between Big Five model and personality characteristics in social networks. International Journal of Intelligent Computing and Information Sciences, 21(2), 41–49.
Štajner, S., & Yenikent, S. (2021, April). Why is MBTI personality detection from texts a difficult task? In Proceedings of the 16th conference of the European chapter of the association for computational linguistics: Main volume (pp. 3580–3589).
Giorgi, S., Nguyen, K. L., Eichstaedt, J. C., Kern, M. L., Yaden, D. B., Kosinski, M., Seligman, M. E., Ungar, L. H., Schwartz, H. A., & Park, G. (2022). Regional personality assessment through social media language. Journal of Personality, 90(3), 405–425.
Celli, F., Lepri, B., Biel, J. I., Gatica-Perez, D., Riccardi, G., & Pianesi, F. (2014, November). The workshop on computational personality recognition 2014. In Proceedings of the 22nd ACM international conference on multimedia (pp. 1245–1246).
Buchanan, T., & Smith, J. L. (1999). Using the Internet for psychological research: Personality testing on the World Wide Web. British Journal of Psychology, 90(1), 125–144.
Youyou, W., Kosinski, M., & Stillwell, D. (2015). Computer-based personality judgments are more accurate than those made by humans. Proceedings of the National Academy of Sciences, 112(4), 1036–1040.
Novikov, P., Mararitsa, L., & Nozdrachev, V. (2021). Inferred vs. traditional personality assessment: Are we predicting the same thing? arXiv e-prints. arXiv-2103.
Argamon, S., Dhawle, S., Koppel, M., & Pennebaker, J. W. (2005, June). Lexical predictors of personality type. In Proceedings of the 2005 joint annual meeting of the interface and the classification society of North America (pp. 1–16).
Mairesse, F., & Walker, M. (2006). Words mark the nerds: Computational models of personality recognition through language. In Proceedings of the annual meeting of the cognitive science society (Vol. 28, No. 28).
Oberlander, J., & Nowson, S. (2006, July). Whose thumb is it anyway? Classifying author personality from weblog text. In Proceedings of the COLING/ACL 2006 main conference poster sessions (pp. 627–634).
Nowson, S., & Oberlander, J. (2007, March). Identifying more bloggers: Towards large scale personality classification of personal weblogs. In Proceedings of the international conference on weblogs and social.
Estival, D., Gaustad, T., Pham, S. B., Radford, W., & Hutchinson, B. (2007, September). Author profiling for English emails. In Proceedings of the 10th conference of the Pacific association for computational linguistics (Vol. 263, p. 272).
Golbeck, J., Robles, C., Edmondson, M., & Turner, K. (2011, October). Predicting personality from Twitter. In 2011 IEEE third international conference on privacy, security, risk and trust and 2011 IEEE third international conference on social computing (pp. 149–156). IEEE.
Golbeck, J., Robles, C., & Turner, K. (2011). Predicting personality with social media. In CHI'11 extended abstracts on human factors in computing systems (pp. 253–262).
Quercia, D., Kosinski, M., Stillwell, D., & Crowcroft, J. (2011, October). Our Twitter profiles, our selves: Predicting personality with Twitter. In 2011 IEEE third international conference on privacy, security, risk and trust and 2011 IEEE third international conference on social computing (pp. 180–185). IEEE.
Adali, S., & Golbeck, J. (2012, August). Predicting personality with social behavior. In 2012 IEEE/ACM international conference on advances in social networks analysis and mining (pp. 302–309). IEEE.
Bai, S., Zhu, T., & Cheng, L. (2012). Big-five personality prediction based on user behaviours at social network sites. Preprint arXiv:1204.4809.
Kermanidis, K. L. (2012, May). Mining authors’ personality traits from Modern Greek spontaneous text. In Proceedings of workshop on corpora for research on emotion sentiment and social signals, in conjunction with LREC (pp. 90–93).
Wald, R., Khoshgoftaar, T., & Sumner, C. (2012, August). Machine prediction of personality from Facebook profiles. In 2012 IEEE 13th international conference on information reuse and integration (IRI) (pp. 109–115). IEEE.
Shen, J., Brdiczka, O., & Liu, J. (2013, June). Understanding email writers: Personality prediction from email messages. In International conference on user modelling, adaptation, and personalization (pp. 318–330). Springer.
Alam, F., Stepanov, E. A., & Riccardi, G. (2013). Personality traits recognition on social network-Facebook. In Proceedings of the international AAAI conference on web and social media (Vol. 7, No. 2, pp. 6–9).
Verhoeven, B., Daelemans, W., & De Smedt, T. (2013, June). Ensemble methods for personality recognition. In Seventh international AAAI conference on weblogs and social media.
Farnadi, G., Zoghbi, S., Moens, M. F., & De Cock, M. (2013, June). Recognising personality traits using Facebook status updates. In Proceedings of the international AAAI conference on web and social media (Vol. 7, No. 1).
Tomlinson, M. T., Hinote, D., & Bracewell, D. B. (2013, June). Predicting conscientiousness through semantic analysis of Facebook posts. In Seventh international AAAI conference on weblogs and social media.
Markovikj, D., Gievska, S., Kosinski, M., & Stillwell, D. J. (2013, June). Mining Facebook data for predictive personality modelling. In Seventh international AAAI conference on weblogs and social media.
Iacobelli, F., & Culotta, A. (2013, June). Too neurotic, not too friendly: Structured personality classification on textual data. In Seventh international AAAI conference on weblogs and social media.
Appling, D., Briscoe, E., Hayes, H., & Mappus, R. (2013, June). Towards automated personality identification using speech acts. In Proceedings of the international AAAI conference on web and social media (Vol. 7, No. 1).
Mohammad, S., & Kiritchenko, S. (2013, June). Using nuances of emotion to identify personality. In Seventh international AAAI conference on weblogs and social media.
Poria, S., Gelbukh, A., Agarwal, B., Cambria, E., & Howard, N. (2013, November). Common sense knowledge based personality recognition from text. In Mexican international conference on artificial intelligence (pp. 484–496). Springer.
Zuo, X., Feng, B., Yao, Y., Zhang, T., Zhang, Q., Wang, M., & Zuo, W. (2013, September). A weighted ML-KNN model for predicting users’ personality traits. In Proc. Int. Conf. Inf. Sci. Comput. Appl. (ISCA) (pp. 345–350).
Gou, L., Zhou, M. X., & Yang, H. (2014, April). KnowMe and ShareMe: Understanding automatically discovered personality traits from social media and user sharing preferences. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 955–964).
Pratama, B. Y., & Sarno, R. (2015, November). Personality classification based on Twitter text using Naive Bayes, KNN and SVM. In 2015 international conference on data and software engineering (ICoDSE) (pp. 170–174). IEEE.
Arroju, M., Hassan, A., & Farnadi, G. (2015). Age, gender and personality recognition using tweets in a multilingual setting. In 6th conference and labs of the evaluation forum (CLEF 2015): Experimental IR meets multilinguality, multimodality, and interaction (Vol. 23, p. 31).
Poddar, S., Kattagoni, V., & Singh, N. (2015). Personality mining from biographical data with the "Adjectival Marker" technique. In BD (pp. 39–47).
Lukito, L. C., Erwin, A., Purnama, J., & Danoekoesoemo, W. (2016, October). Social media user personality classification using computational linguistic. In 2016 8th international conference on information technology and electrical engineering (ICITEE) (pp. 1–6). IEEE.
Pramodh, K. C., & Vijayalata, Y. (2016, October). Automatic personality recognition of authors using big five factor model. In 2016 IEEE international conference on advances in computer applications (ICACA) (pp. 32–37). IEEE.
Ong, V., Rahmanto, A. D. S., Williem, W., Suhartono, D., Nugroho, A. E., Andangsari, E. W., & Suprayogi, M. N. (2017, November). Personality prediction based on Twitter information in Bahasa Indonesia. In 2017 federated conference on computer science and information systems (FedCSIS) (pp. 367–372). https://doi.org/10.15439/2017F359
Tandera, T., Suhartono, D., Wongso, R., & Prasetio, Y. L. (2017). Personality prediction system from Facebook users. Procedia Computer Science, 116, 604–611.
Ahmad, Z., Lutfi, S. L., Kushan, A. L., & Yixing, R. T. (2017). Personality prediction of Malaysian Facebook users: Cultural preferences and features variation. Advanced Science Letters, 23(8), 7900–7903.
Yata, A., Kante, P., Sravani, T., & Malathi, B. (2018). Personality recognition using multi-label classification. International Research Journal of Engineering and Technology (IRJET), 5(03), 1.
Arjaria, S., Shrivastav, A., Rathore, A. S., & Tiwari, V. (2019). Personality trait identification for written texts using MLNB. In Data, engineering and applications (pp. 131–137). Springer.
Artissa, Y. B. N. D., Asror, I., & Faraby, S. A. (2019, May). Personality classification based on Facebook status text using Multinomial Naïve Bayes method. Journal of Physics: Conference Series, 1192, 012003. https://doi.org/10.1088/1742-6596/1192/1/012003
Ergu, İ., Işık, Z., & Yankayış, İ. (2019). Predicting personality with Twitter data and machine learning models. In 2019 innovations in intelligent systems and applications conference (ASYU) (pp. 1–5). IEEE.
Rohit, G. V., Bharadwaj, K. R., Hemanth, R., Pruthvi, B., & Kumar, M. (2020, August). Machine intelligence based personality prediction using social profile data. In 2020 3rd international conference on smart systems and inventive technology (ICSSIT) (pp. 1003–1008). IEEE.
Ong, V., Rahmanto, A. D. S., Williem, W., Jeremy, N. H., Suhartono, D., & Andangsari, E. W. (2021). Personality modelling of Indonesian Twitter users with XGBoost based on the five factor model. International Journal of Intelligent Engineering and Systems, 14, 248–261. https://doi.org/10.22266/ijies2021.0430.22
Safitri, G., & Setiawan, E. B. (2022). Optimization prediction of Big Five personality in Twitter users. Journal RESTI (Rekayasa Sistem dan Teknologi Informasi), 6, 85–91. https://doi.org/10.29207/resti.v6i1.3529
Vu, X. S., Flekova, L., Jiang, L., & Gurevych, I. (2018, January). Lexical-semantic resources: Yet powerful resources for automatic personality classification. In Proceedings of the 9th global WORDNET conference (pp. 172–181).
Fernandes, B., González-Briones, A., Novais, P., Calafate, M., Analide, C., & Neves, J. (2020). An adjective selection personality assessment method using gradient boosting machine learning. Processes, 8(5), 618.
Kalghatgi, M. P., Ramannavar, M., & Sidnal, N. S. (2015). A neural network approach to personality prediction based on the big-five model. International Journal of Innovative Research in Advanced Engineering (IJIRAE), 2(8), 56–63.
Su, M. H., Wu, C. H., & Zheng, Y. T. (2016). Exploiting turn-taking temporal evolution for personality trait perception in dyadic conversations. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(4), 733–744.
Liu, F., Perez, J., & Nowson, S. (2016). A language-independent and compositional model for personality trait recognition from short texts. Preprint arXiv:1610.04345.
Xianyu, H., Xu, M., Wu, Z., & Cai, L. (2016, July). Heterogeneity-entropy based unsupervised feature learning for personality prediction with cross-media data. In 2016 IEEE international conference on multimedia and Expo (ICME) (pp. 1–6). IEEE.
Sun, X., Liu, B., Cao, J., Luo, J., & Shen, X. (2018, May). Who am I? Personality detection based on deep learning for texts. In 2018 IEEE international conference on communications (ICC) (pp. 1–6). IEEE.
An, G., & Levitan, R. (2018, February). Lexical and acoustic deep learning model for personality recognition. In INTERSPEECH (pp. 1761–1765).
Yılmaz, T., Ergil, A., & İlgen, B. (2019, October). Deep learning-based document modelling for personality detection from Turkish Texts. In Proceedings of the future technologies conference (pp. 729–736). Springer.
Kazameini, A., Fatehi, S., Mehta, Y., Eetemadi, S., & Cambria, E. (2020). Personality trait detection using bagged SVM over BERT word embedding ensembles. Preprint arXiv:2010.01309.
Leonardi, S., Monti, D., Rizzo, G., & Morisio, M. (2020). Multilingual transformer-based personality traits estimation. Information, 11(4), 179.
Xue, X., Feng, J., & Sun, X. (2021). Semantic-enhanced sequential modeling for personality trait recognition from texts. Applied Intelligence, 51(11), 7705–7717.
El-Demerdash, K., El-Khoribi, R. A., Shoman, M. A. I., & Abdou, S. (2021). Deep learning based fusion strategies for personality prediction. Egyptian Informatics Journal, 1, 1.
Christian, H., Suhartono, D., Chowanda, A., & Zamli, K. Z. (2021). Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging. Journal of Big Data, 8(1), 1–20.
Jeremy, N. H., & Suhartono, D. (2021). Automatic personality prediction from Indonesian user on Twitter using word embedding and neural networks. Procedia Computer Science, 179, 416–422.
Mavis, G., Toroslu, I. H., & Karagoz, P. (2021). Personality analysis using classification on Turkish tweets. International Journal of Cognitive Informatics and Natural Intelligence, 15, 1–18. https://doi.org/10.4018/ijcini.287596
Kosan, M. A., Karacan, H., & Urgen, B. A. (2022). Predicting personality traits with semantic structures and LSTM-based neural networks. Alexandria Engineering Journal, 61(10), 8007–8025.
Majumder, N., Poria, S., Gelbukh, A., & Cambria, E. (2017). Deep learning-based document modeling for personality detection from text. IEEE Intelligent Systems, 32(2), 74–79.
Yu, J., & Markov, K. (2017, November). Deep learning based personality recognition from Facebook status updates. In 2017 IEEE 8th international conference on awareness science and technology (iCAST) (pp. 383–387). IEEE.
Giménez, M., Paredes, R., & Rosso, P. (2017, April). Personality recognition using convolutional neural networks. In International conference on computational linguistics and intelligent text processing (pp. 313–323). Springer.
Xue, D., Wu, L., Hong, Z., Guo, S., Gao, L., Wu, Z., & Sun, J. (2018). Deep learning-based personality recognition from text posts of online social networks. Applied Intelligence, 48(11), 4232–4246.
Rahman, M. A., Al Faisal, A., Khanam, T., Amjad, M., & Siddik, M. S. (2019, May). Personality detection from text using convolutional neural network. In 2019 1st international conference on advances in science, engineering and robotics technology (ICASERT) (pp. 1–6). IEEE.
Darliansyah, A., Naeem, M. A., Mirza, F., & Pears, R. (2019). SENTIPEDE: A smart system for sentiment-based personality detection from short texts. Journal of Universal Computer Science, 25, 1323–1352. https://doi.org/10.3217/jucs-025-10-1323
Mehta, Y., Fatehi, S., Kazameini, A., Stachl, C., Cambria, E., & Eetemadi, S. (2020, November). Bottom-up and top-down: Predicting personality with psycholinguistic and language model features. In 2020 IEEE international conference on data mining (ICDM) (pp. 1184–1189). IEEE.
Deilami, F. M., Sadr, H., & Nazari, M. (2022). Using machine learning based models for personality recognition. Preprint arXiv:2201.06248.
Deilami, F. M., Sadr, H., & Tarkhan, M. (2022). Contextualized multidimensional personality recognition using combination of deep neural network and ensemble learning. Neural Processing Letters. https://doi.org/10.1007/s11063-022-10787-9
Guan, Z., Wu, B., Wang, B., & Liu, H. (2020, July). Personality2vec: Network representation learning for personality. In 2020 IEEE 5th international conference on data science in cyberspace (DSC) (pp. 30–37). IEEE.
Wang, Z., Wu, C. H., Li, Q. B., Yan, B., & Zheng, K. F. (2020). Encoding text information with graph convolutional networks for personality recognition. Applied Sciences, 10(12), 4081.
Wang, Y., Zheng, J., Li, Q., Wang, C., Zhang, H., & Gong, J. (2021). XLNet-Caps: Personality classification from textual posts. Electronics (Switzerland). https://doi.org/10.3390/electronics10111360
Ramezani, M., Feizi-Derakhshi, M. R., & Balafar, M. A. (2022). Knowledge graph-enabled text-based automatic personality prediction. Preprint arXiv:2203.09103.
Jiang, H., Zhang, X., & Choi, J. D. (2020, April). Automatic text-based personality recognition on monologues and multiparty dialogues using attentive networks and contextual embeddings (student abstract). In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 10, pp. 13821–13822).
Li, Y., Kazemeini, A., Mehta, Y., & Cambria, E. (2022). Multitask learning for emotion and personality traits detection. Neurocomputing, 493, 340–350. https://doi.org/10.1016/j.neucom.2022.04.049
Celli, F. (2012, March). Unsupervised personality recognition for social network sites. In Proceedings of sixth international conference on digital society (pp. 59–62).
Celli, F., & Rossi, L. (2012, April). The role of emotional stability in Twitter conversations. In Proceedings of the workshop on semantic analysis in social media (pp. 10–17).
Liu and Zhu proposed using stacked autoencoders for unsupervised learning of a Linguistic Representation Feature Vector (LRFV), based on SLIWC and FFT, from Sina microblog data. The resulting features were used to train a linear regression model, and the results outperformed the selected baselines.
Alsadhan, N., & Skillicorn, D. (2017, November). Estimating personality from social media posts. In 2017 IEEE international conference on data mining workshops (ICDMW) (pp. 350–356). IEEE.
Celli, F., & Lepri, B. (2018). Is big five better than MBTI? A personality computing challenge using Twitter data. Computational Linguistics CLiC-it, 2018, 93.
Lima, A. C., & de Castro, L. N. (2013, September). Multi-label semi-supervised classification applied to personality prediction in Tweets. In 2013 BRICS congress on computational intelligence and 11th Brazilian congress on computational intelligence (pp. 195–203). IEEE.
Lima, A. C. E., & De Castro, L. N. (2014). A multi-label, semi-supervised classification approach applied to personality prediction in social media. Neural Networks, 58, 122–130.
Tighe, E. P., Ureta, J. C., Pollo, B. A. L., Cheng, C. K., & de Dios Bulos, R. (2016, July). Personality trait classification of essays with the application of feature reduction. In SAAIP@ IJCAI (pp. 22–28).
Tighe, E., & Cheng, C. (2018, June). Modeling personality traits of Filipino Twitter users. In Proceedings of the 2nd workshop on computational modelling of people’s opinions, personality, and emotions in social media (pp. 112–122).
Mao, Y., Zhang, D., Wu, C., Zheng, K., & Wang, X. (2018, December). Feature analysis and optimisation for computational personality recognition. In 2018 IEEE 4th international conference on computer and communications (ICCC) (pp. 2410–2414). IEEE.
Adi, G. Y. N., Tandio, M. H., Ong, V., & Suhartono, D. (2018). Optimization for automatic personality recognition on Twitter in Bahasa Indonesia. Procedia Computer Science, 135, 473–480.
Carducci, G., Rizzo, G., Monti, D., Palumbo, E., & Morisio, M. (2018). Twitpersonality: Computing personality traits from tweets using word embeddings and supervised learning. Information, 9(5), 127.
Dos Santos, W. R., Ramos, R. M., & Paraboni, I. (2019). Computational personality recognition from Facebook text: Psycholinguistic features, words and facets. New Review of Hypermedia and Multimedia, 25(4), 268–287.
Akrami, N., Fernquist, J., Isbister, T., Kaati, L., & Pelzer, B. (2019, December). Automatic extraction of personality from text: Challenges and opportunities. In 2019 IEEE international conference on big data (big data) (pp. 3156–3164). IEEE.
Zheng, H., & Wu, C. (2019, February). Predicting personality using Facebook status based on semi-supervised learning. In Proceedings of the 2019 11th international conference on machine learning and computing (pp. 59–64).
Tighe, E., Aran, O., & Cheng, C. (2020). Exploring neural network approaches in automatic personality recognition of Filipino Twitter users.
Pabón, F. O. L., & Arroyave, J. R. O. (2021, December). Automatic personality evaluation from transliterations of YouTube Vlogs using classical and state of the art word embeddings. Ingeniería e Investigación, 42, e93803. https://doi.org/10.15446/ing.investig.93803
Alamsyah, A., Putra, M. R. D., Fadhilah, D. D., Nurwianti, F., & Ningsih, E. (2018, May). Ontology modelling approach for personality measurement based on social media activity. In 2018 6th international conference on information and communication technology (ICoICT) (pp. 507–513). IEEE.
Alamsyah, A., Rachman, M. F., Hudaya, C. S., Putra, R. P., Rifkyano, A. I., & Nurwianti, F. (2019a). A progress on the personality measurement model using ontology based on social media text.
Alamsyah, A., Dudija, N., & Widiyanesti, S. (2021). New approach of measuring human personality traits using ontology-based model from social media data. Information (Switzerland). https://doi.org/10.3390/info12100413
Ethics declarations
Conflict of interest
We declare there are no known competing monetary interests or personal relationships that could influence the work reported in this paper.
Limitations of the survey
We tried to include every work in the domain of Automatic Personality Recognition from text; this breadth makes comparing the computational techniques extremely difficult. Further, by excluding trait theories other than the Big 5 model, we may have missed computational techniques that perform better.
About this article
Cite this article
Teli, M.A., Chachoo, M.A. Lingual markers for automating personality profiling: background and road ahead. J Comput Soc Sc 5, 1663–1707 (2022). https://doi.org/10.1007/s42001-022-00184-6
DOI: https://doi.org/10.1007/s42001-022-00184-6