Abstract
This study takes a data-driven approach to mapping the personality traits of Chinese people in the Chinese social context. Following the lexical hypothesis of personality, word embedding techniques from machine learning were used to mine personality vocabulary from Tencent’s word embedding database. More than 10,000 Chinese personality descriptors were extracted and analyzed using Gaussian mixture model clustering and hierarchical clustering. Questionnaire data were collected online from 658 Chinese respondents randomly sampled from across China. The results reveal six personality traits in the Chinese context, expand the personality thesaurus, and provide examples that illustrate each trait. The findings agree with previous research on the five-factor model, which partially describes the personality traits of Chinese people but does not fully explain their typical social behavior patterns. The study also supports the notion of cultural particularity in personality traits. Compared with traditional personality mining methods, the approach yields a richer personality vocabulary, and word embeddings capture richer semantic information in Chinese. The six Chinese personality traits identified in this study will also be used to explore how to quantify and evaluate personality traits based on word embeddings and personality descriptors.
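To make the clustering step concrete, the following is a minimal sketch of hierarchical (agglomerative) clustering over word vectors. It is not the authors' pipeline: the six English words, their 3-dimensional vectors, the choice of cosine distance with average linkage, and the function names are all illustrative assumptions. Real Tencent embeddings are much higher-dimensional and cover Chinese vocabulary.

```python
import math
from itertools import combinations

# Hypothetical toy embeddings (assumption): stand-ins for real word vectors.
EMBEDDINGS = {
    "kind": [0.9, 0.1, 0.0],
    "generous": [0.8, 0.2, 0.1],
    "diligent": [0.1, 0.9, 0.1],
    "hardworking": [0.2, 0.8, 0.0],
    "anxious": [0.0, 0.1, 0.9],
    "nervous": [0.1, 0.0, 0.8],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def average_linkage(c1, c2, vectors):
    """Mean pairwise cosine distance between the words of two clusters."""
    dists = [cosine_distance(vectors[a], vectors[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)


def agglomerative_clusters(vectors, n_clusters):
    """Merge the two closest clusters until only n_clusters remain."""
    clusters = [[word] for word in vectors]  # start with one word per cluster
    while len(clusters) > n_clusters:
        # combinations yields index pairs with i < j, so deleting j is safe.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], vectors),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


clusters = agglomerative_clusters(EMBEDDINGS, n_clusters=3)
for group in clusters:
    print(group)
```

In this toy setting, words with similar vectors end up in the same group, which is the intuition behind grouping personality descriptors into candidate traits. In practice the number of clusters would be chosen with a validity criterion rather than fixed in advance.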
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Funding
This work was supported by South China Normal University under the project “Network Personality Trait Model Construction and its Educational Application Research” (project number 21JXKA11).
Ethics declarations
Human and Animal Rights
This article does not contain any studies with human or animal subjects performed by any of the authors.
Informed Consent
Informed consent was obtained from all individual participants included in the study.
Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Conflict of interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
About this article
Cite this article
Ding, Y., Zheng, F., Xu, L. et al. A Richer Vocabulary of Chinese Personality Traits: Leveraging Word Embedding Technology for Mining Personality Descriptors. J Psycholinguist Res 53, 33 (2024). https://doi.org/10.1007/s10936-024-10060-1
DOI: https://doi.org/10.1007/s10936-024-10060-1