A Richer Vocabulary of Chinese Personality Traits: Leveraging Word Embedding Technology for Mining Personality Descriptors

  • Original Research
Journal of Psycholinguistic Research

Abstract

This study takes a data-driven approach to mining the personality traits of Chinese people in the Chinese social context. Following the lexical hypothesis of personality, word embedding technology from machine learning was used to mine personality vocabulary from Tencent’s word embedding database. More than 10,000 Chinese personality descriptors were extracted and analyzed using Gaussian mixture model clustering and hierarchical clustering analysis. Data were collected through an online questionnaire from 658 Chinese people randomly sampled from all parts of China. The results reveal six personality traits in the Chinese context, expand the personality thesaurus, and provide examples that illustrate each trait. The findings are consistent with previous research on the five-factor model, which partially describes the personality traits of Chinese people but does not fully explain their typical social behavior patterns. The study also supports the notion of cultural particularity in personality traits. The approach yields a richer personality vocabulary than traditional personality mining methods, and word embedding technology captures richer semantic information in Chinese. The six Chinese personality traits identified in this study will also be used to explore how personality traits can be quantified and evaluated on the basis of word embeddings and personality descriptors.
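
To make the clustering step named in the abstract more concrete, the following is a minimal sketch of hierarchical (agglomerative) clustering applied to word vectors. It is not the authors' pipeline: the toy three-dimensional vectors, the English example words, the choice of cosine distance with average linkage, and the function names are all assumptions made for illustration, and the Gaussian mixture model stage is not shown.

```python
import math
from itertools import combinations

# Toy 3-dimensional "embeddings" with made-up values (illustration only;
# real pretrained word vectors have far more dimensions).
TOY_VECTORS = {
    "kind":      [0.90, 0.10, 0.05],
    "generous":  [0.85, 0.20, 0.10],
    "diligent":  [0.10, 0.90, 0.15],
    "organized": [0.15, 0.85, 0.05],
    "anxious":   [0.05, 0.10, 0.95],
    "moody":     [0.10, 0.05, 0.90],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def average_linkage(cluster_a, cluster_b, vectors):
    """Mean pairwise cosine distance between members of two clusters."""
    total = sum(
        cosine_distance(vectors[a], vectors[b])
        for a in cluster_a
        for b in cluster_b
    )
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative_clusters(vectors, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[word] for word in vectors]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average-linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], vectors
            ),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    for cluster in agglomerative_clusters(TOY_VECTORS, n_clusters=3):
        print(cluster)
```

On this toy data, descriptors with similar vectors end up in the same group. With real embeddings the number of clusters would have to be chosen with a validity criterion instead of being fixed in advance.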

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Funding

This work was supported by South China Normal University under the project “Network Personality Trait Model Construction and its Educational Application Research” (project number 21JXKA11).

Author information

Corresponding author

Correspondence to Feijun Zheng.

Ethics declarations

Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Conflict of interest

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ding, Y., Zheng, F., Xu, L. et al. A Richer Vocabulary of Chinese Personality Traits: Leveraging Word Embedding Technology for Mining Personality Descriptors. J Psycholinguist Res 53, 33 (2024). https://doi.org/10.1007/s10936-024-10060-1

  • DOI: https://doi.org/10.1007/s10936-024-10060-1
