Abstract
Word affective ratings are important tools in psycholinguistic research, natural language processing, and many other fields. However, even for well-studied languages, such norms are usually limited in scale. To extrapolate affective (i.e., valence and arousal) values for words in the SUBTLEX-CH database (Cai & Brysbaert, 2010, PLoS ONE, 5(6):e10729), we implemented a computational neural network which captured how words’ vector-based semantic representations corresponded to the probability densities of their valence and arousal. Based on these probability density functions, we predicted not only a word’s affective values, but also their respective degrees of variability that could characterize individual differences in human affective ratings. The resulting estimates of affective values largely converged with human ratings for both valence and arousal, and the estimated degrees of variability also captured important features of the variability in human ratings. We released the extrapolated affective values, together with their corresponding degrees of variability, for over 38,000 Chinese words in the Open Science Framework (https://osf.io/s9zmd/). We also discussed how the view of embodied cognition could be illuminated by this computational model.
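To illustrate how a single probability density can yield both an affective value and a degree of variability, the following is a minimal sketch that assumes the density is a one-dimensional Gaussian mixture, as in a standard mixture density network (Bishop, 1994). The function name, the toy mixture parameters, and the choice of the mixture mean as the point estimate are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def mixture_mean_sd(weights, means, sds):
    """Mean and SD of a one-dimensional Gaussian mixture."""
    weights, means, sds = map(np.asarray, (weights, means, sds))
    mean = np.sum(weights * means)
    # Law of total variance: within-component variance plus between-component spread.
    variance = np.sum(weights * (sds**2 + (means - mean) ** 2))
    return mean, np.sqrt(variance)

# Hypothetical two-component density for an "ambivalent" word on a z-scored valence scale.
mean, sd = mixture_mean_sd(weights=[0.5, 0.5], means=[-1.0, 1.0], sds=[0.5, 0.5])
print(mean, sd)
```

In this toy case the two components pull in opposite directions, so the mean is neutral while the SD is large, which is the pattern one would expect for a word that raters disagree about.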
Data Availability
The database generated in the present study is available at https://osf.io/s9zmd/.
Notes
We utilized MacBERT-Large as it is the top-performing language model in the Chinese BERT series (Cui et al., 2020). The pretrained model can be obtained from https://github.com/ymcui/MacBERT or accessed through the Hugging Face Python library.
The large corpus comprised text in both formal and colloquial language. The largest portion came from the Baidu Baike corpus. Also included were the 2019 dump of Wikipedia, the News corpus, Weibo, and two widely used corpora: the Lancaster Corpus of Mandarin Chinese (McEnery & Xiao, 2004) and the one issued by the National Language Commission.
Because Chinese BERT is character-based, the token embedding was generated by averaging those of the character components.
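The averaging step can be sketched as follows. This is a minimal illustration, assuming the character-level vectors are already available in a lookup table; the `word_embedding` helper and the toy four-dimensional vectors are hypothetical and do not come from MacBERT.

```python
import numpy as np

def word_embedding(word, char_vectors):
    """Average the vectors of a word's characters into one word-level vector."""
    vectors = [char_vectors[ch] for ch in word]
    return np.mean(vectors, axis=0)

# Toy 4-dimensional character vectors (hypothetical values, not real BERT output).
char_vectors = {
    "快": np.array([1.0, 0.0, 2.0, 4.0]),
    "乐": np.array([3.0, 2.0, 0.0, 0.0]),
}

happy = word_embedding("快乐", char_vectors)
print(happy)  # [2. 1. 1. 2.]
```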
As words vary differently along the z.valence.mean and z.arousal.mean scales, this stratification process could not guarantee that a given word was assigned to the same set (training or validation) in the two separate models for valence and arousal.
The nearest neighbors were identified on the basis of the cosine similarity between BERT type embeddings. To place the models on an equal footing for comparison, the baseline model was evaluated using the same set of words as the MDNs across the five folds, where the predicted value (i.e., z.valence.mean or z.arousal.mean) of a word in the validation set was the average value of its k-nearest neighbors in the training set. Given that the predictive performance depends greatly on the number of nearest neighbors, we tested the baseline model by setting the parameter k to 5, 10, 20, 30, 50, 100, 200, and 300. Through five-fold cross-validation, the optimal setting was obtained at k = 20, where the predicted values showed a correlation of 0.870 and 0.757 for z.valence.mean and z.arousal.mean, respectively. These results are reported in Tables 3 and 4 for comparison with the predictions based on MDNs.
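A baseline of this kind could be sketched as below. It is a minimal illustration under stated assumptions: the function name `knn_predict`, the two-dimensional toy vectors, and the toy valence values are hypothetical, and k is set to 2 only because the toy training set has four words.

```python
import numpy as np

def knn_predict(train_vecs, train_values, query_vecs, k=20):
    """Predict each query's value as the mean value of its k most cosine-similar training items."""
    # Normalise rows so that a dot product equals cosine similarity.
    train_unit = train_vecs / np.linalg.norm(train_vecs, axis=1, keepdims=True)
    query_unit = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    sims = query_unit @ train_unit.T  # shape: (n_query, n_train)
    # Indices of the k largest similarities per query (order within the k is irrelevant).
    top_k = np.argpartition(-sims, k - 1, axis=1)[:, :k]
    return train_values[top_k].mean(axis=1)

# Toy data (hypothetical): four training words in 2-D with z-scored valence.
train_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
train_valence = np.array([1.0, 0.8, -1.0, -0.6])
query_vecs = np.array([[1.0, 0.05], [0.05, 1.0]])

pred = knn_predict(train_vecs, train_valence, query_vecs, k=2)
print(pred)
```

With real data, the predictions for the validation words would then be correlated with the human ratings, for example with `np.corrcoef`, once per fold and per value of k.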
For a target word in the validation set, we looked for its k-nearest neighbors (k = 20) in the training set based on the cosine similarity between BERT type embeddings. We then took the SDs of their affective values (i.e., z.valence.mean or z.arousal.mean) as the computed variability.
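The neighbor-based variability could be computed along these lines. This is a sketch only: the helper name and toy data are hypothetical, and the use of the sample SD (`ddof=1`) is an assumption, since the note does not say which estimator was used.

```python
import numpy as np

def knn_variability(train_vecs, train_values, target_vec, k=20):
    """SD of the affective values of the k training words closest to the target."""
    sims = (train_vecs @ target_vec) / (
        np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(target_vec)
    )
    nearest = np.argsort(-sims)[:k]  # indices of the k most similar training words
    # ddof=1 gives the sample SD; this choice is an assumption, not taken from the source.
    return train_values[nearest].std(ddof=1)

# Toy data (hypothetical): four training words in 2-D with z-scored valence.
train_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
train_valence = np.array([1.0, 0.8, -1.0, -0.6])

sd = knn_variability(train_vecs, train_valence, np.array([1.0, 0.05]), k=2)
print(sd)
```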
We thank an anonymous reviewer for making this point.
In light of compositional semantics (Choi & Cardie, 2008; Moilanen & Pulman, 2008), we also took a hybrid approach in which the affective values collected for the constituent characters (Peng et al., 2023) were incorporated into the current computational model. However, the ablation study showed that the hybrid model performed almost on par with the one based solely on word embeddings, indicating that the contribution of the characters’ affective values was limited, even though they could explain some variance in the words’ affective values and generate decent predictions. For the sake of simplicity, we reported the computational model that was implemented based purely on word embeddings.
The pretrained model was obtained from https://code.google.com/archive/p/word2vec/.
References
Antoniak, M., & Mimno, D. (2018). Evaluating the stability of embedding-based word similarities. Transactions of the Association for Computational Linguistics, 6, 107–119.
Bestgen, Y., & Vincze, N. (2012). Checking and bootstrapping lexical norms by means of word similarity indexes. Behavior Research Methods, 44(4), 998–1006.
Binder, J. R., Conant, L. L., Humphries, C. J., Fernandino, L., Simons, S. B., Aguilar, M., & Desai, R. H. (2016). Toward a brain-based componential semantic representation. Cognitive Neuropsychology, 33(3–4), 130–174.
Bishop, C. M. (1994). Mixture density networks (Technical Report No. NCRG/94/004). Birmingham, UK: Aston University, Neural Computing Research Group.
Bommasani, R., Davis, K., & Cardie, C. (2020). Interpreting pretrained contextualized representations via reductions to static embeddings. In: Paper presented at the Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 4758–4781). Online.
Bradley, M. M., & Lang, P. J. (1999). Affective Norms for English Words (ANEW): Instruction Manual and Affective Ratings (Technical Report No. C-1). Gainesville, USA: University of Florida, NIMH Center for Research in Psychophysiology.
Cai, Q., & Brysbaert, M. (2010). SUBTLEX-CH: Chinese word and character frequencies based on film subtitles. PLoS ONE, 5(6), e10729.
Calderon-Delgado, L., Barrera-Valencia, M., Noriega, I., Al-Khalil, K., Trejos-Castillo, E., Mosi, J., Chavez, B., Galvan, M., & O’Boyle, M. W. (2020). Implicit processing of emotional words by children with post-traumatic stress disorder: An fMRI investigation. International Journal of Clinical and Health Psychology, 20(1), 46–53.
Calvo, R. A., & D’Mello, S. (2010). Affect detection: An interdisciplinary review of models, methods, and their applications. IEEE Transactions on Affective Computing, 1(1), 18–37.
Chersoni, E., Santus, E., Huang, C. R., & Lenci, A. (2021). Decoding word embeddings with brain-based semantic features. Computational Linguistics, 47(3), 663–698.
Choi, Y., & Cardie, C. (2008). Learning with compositional semantics as structural inference for subsentential sentiment analysis. In: Paper presented at the Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (pp. 793–801). Honolulu, USA.
Citron, F. M., Weekes, B. S., & Ferstl, E. C. (2014). Arousal and emotional valence interact in written word recognition. Language, Cognition and Neuroscience, 29(10), 1257–1267.
Ćoso, B., Guasch, M., Ferré, P., & Hinojosa, J. A. (2019). Affective and concreteness norms for 3,022 Croatian words. Quarterly Journal of Experimental Psychology, 72(9), 2302–2312.
Cui, Y., Che, W., Liu, T., Qin, B., Wang, S., & Hu, G. (2020). Revisiting pre-trained models for Chinese natural language processing. In: Paper presented at the Findings of the Association for Computational Linguistics: Empirical Methods in Natural Language Processing 2020 (pp. 657–668). Online.
De Deyne, S., & Storms, G. (2008). Word associations: Network and semantic properties. Behavior Research Methods, 40(1), 213–231.
De Deyne, S., & Storms, G. (2008). Word associations: Norms for 1,424 Dutch words in a continuous task. Behavior Research Methods, 40(1), 198–205.
De Deyne, S., Navarro, D. J., & Storms, G. (2013). Better explanations of lexical and semantic cognition using networks derived from continued rather than single-word associations. Behavior Research Methods, 45(2), 480–498.
De Deyne, S., Verheyen, S., & Storms, G. (2015). The role of corpus size and syntax in deriving lexico-semantic representations for a wide range of concepts. Quarterly Journal of Experimental Psychology, 68(8), 1643–1664.
De Deyne, S., Navarro, D. J., Perfors, A., Brysbaert, M., & Storms, G. (2019). The “Small World of Words” English word association norms for over 12,000 cue words. Behavior Research Methods, 51(3), 987–1006.
Deese, J. (1966). The structure of associations in language and thought. Johns Hopkins University Press.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In: Paper presented at the Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 4171–4186). Minneapolis, USA.
Endres, M. J., & Fein, G. (2013). Emotion-word processing difficulties in abstinent alcoholics with and without lifetime externalizing disorders. Alcoholism: Clinical and Experimental Research, 37(5), 831–838.
Ethayarajh, K. (2019). How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings. In: Paper presented at the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (pp. 55–65). Hong Kong, China.
Fazio, R. H. (2001). On the automatic activation of associated evaluations: An overview. Cognition & Emotion, 15(2), 115–141.
Fraga, I., Guasch, M., Haro, J., Padrón, I., & Ferré, P. (2018). EmoFinder: The meeting point for Spanish emotional words. Behavior Research Methods, 50(1), 84–93.
Grandy, T. H., Lindenberger, U., & Schmiedek, F. (2020). Vampires and nurses are rated differently by younger and older adults—Age-comparative norms of imageability and emotionality for about 2500 German nouns. Behavior Research Methods, 52(3), 980–989.
Günther, F., Rinaldi, L., & Marelli, M. (2019). Vector-space models of semantic representation from a cognitive perspective: A discussion of common misconceptions. Perspectives on Psychological Science, 14(6), 1006–1033.
Günther, F., Petilli, M. A., Vergallito, A., & Marelli, M. (2022). Images of the unseen: Extrapolating visual representations for abstract and concrete words in a data-driven computational model. Psychological Research, 86(8), 2512–2532.
Hinojosa, J. A., Moreno, E. M., & Ferré, P. (2020). Affective neurolinguistics: Towards a framework for reconciling language and emotion. Language, Cognition and Neuroscience, 35(7), 813–839.
Hollis, G. (2017). Estimating the average need of semantic knowledge from distributional semantic models. Memory & Cognition, 45(8), 1350–1370.
Hollis, G., Westbury, C., & Lefsrud, L. (2017). Extrapolating human judgments from skip-gram vector representations of word meaning. Quarterly Journal of Experimental Psychology, 70(8), 1603–1619.
Humphreys, G. F., Hoffman, P., Visser, M., Binney, R. J., & Ralph, M. A. L. (2015). Establishing task- and modality-dependent dissociations between the semantic and default mode networks. Proceedings of the National Academy of Sciences, 112(25), 7857–7862.
Imbir, K. K. (2021). Affective Norms for 4900 Polish Words Reload (ANPW_R): Assessments for valence, arousal, dominance, origin, significance, concreteness, imageability and age of acquisition. Frontiers in Psychology, 12(7), 1081–2016.
Inohara, K., & Utsumi, A. (2022). JWSAN: Japanese word similarity and association norm. Language Resources and Evaluation, 56(1), 109–137.
Islam, M. R., & Zibran, M. F. (2018). SentiStrength-SE: Exploiting domain specificity for improved sentiment analysis in software engineering text. Journal of Systems and Software, 145, 125–146.
Kapucu, A., Kılıç, A., Özkılıç, Y., & Sarıbaz, B. (2021). Turkish emotional word norms for arousal, valence, and discrete emotion categories. Psychological Reports, 124(1), 188–209.
Kousta, S. T., Vinson, D. P., & Vigliocco, G. (2009). Emotion words, regardless of polarity, have a processing advantage over neutral words. Cognition, 112(3), 473–481.
Kousta, S. T., Vigliocco, G., Vinson, D. P., Andrews, M., & Del Campo, E. (2011). The representation of abstract words: Why emotion matters. Journal of Experimental Psychology: General, 140(1), 14–34.
Kuperman, V., Estes, Z., Brysbaert, M., & Warriner, A. B. (2014). Emotion and language: Valence and arousal affect word recognition. Journal of Experimental Psychology: General, 143(3), 1065–1081.
Lahl, O., Göritz, A. S., Pietrowsky, R., & Rosenberg, J. (2009). Using the World-Wide Web to obtain large-scale word norms: 190,212 ratings on a set of 2,654 German nouns. Behavior Research Methods, 41(1), 13–19.
Lambon Ralph, M. A., Jefferies, E., Patterson, K., & Rogers, T. T. (2017). The neural and computational bases of semantic cognition. Nature Reviews Neuroscience, 18(1), 42–55.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211–240.
Lenci, A., Lebani, G. E., & Passaro, L. C. (2018). The emotions of abstract words: A distributional semantic analysis. Topics in Cognitive Science, 10(3), 550–572.
Lenci, A., Sahlgren, M., Jeuniaux, P., Gyllensten, A. C., & Miliani, M. (2022). A comparative evaluation and analysis of three generations of distributional semantic models. Language Resources and Evaluation, 56(4), 1269–1313.
Li, S., Zhao, Z., Hu, R., Li, W., Liu, T., & Du, X. (2018). Analogical reasoning on Chinese morphological and semantic relations. In: Paper presented at the Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (pp. 138–143). Melbourne, Australia.
Liu, P., Lu, Q., Zhang, Z., Tang, J., & Han, B. (2021). Age-related differences in affective norms for Chinese words (AANC). Frontiers in Psychology, 12, 585666.
Mandera, P., Keuleers, E., & Brysbaert, M. (2015). How useful are corpus-based methods for extrapolating psycholinguistic variables? Quarterly Journal of Experimental Psychology, 68(8), 1623–1642.
Mandera, P., Keuleers, E., & Brysbaert, M. (2017). Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation. Journal of Memory and Language, 92, 57–78.
Manning, C. D., & Schütze, H. (1999). Foundations of statistical natural language processing. MIT Press.
Martin, C. B., Douglas, D., Newsome, R. N., Man, L. L. Y., & Barense, M. D. (2018). Integrative and distinctive coding of visual and conceptual object features in the ventral visual stream. eLife, 7, e31873.
Martínez-Huertas, J. A., Jorge-Botana, G., Luzón, J. M., & Olmos, R. (2021). Redundancy, isomorphism, and propagative mechanisms between emotional and amodal representations of words: A computational study. Memory & Cognition, 49(2), 219–234.
McEnery, A., & Xiao, Z. (2004). The Lancaster Corpus of Mandarin Chinese: A Corpus for monolingual and contrastive language study. In: Paper presented at the Proceedings of the 4th International Conference on Language Resources and Evaluation, Lisbon, Portugal.
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. ArXiv:1301.3781.
Miller, G. A., Beckwith, R., Fellbaum, C., Gross, D., & Miller, K. (1990). WordNet: An on-line lexical database. International Journal of Lexicography, 3, 235–244.
Moilanen, K., & Pulman, S. (2008). The good, the bad, and the unknown: Morphosyllabic sentiment tagging of unseen words. In: Paper presented at the Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies (pp. 109–112). Columbus, USA.
Monnier, C., & Syssau, A. (2014). Affective norms for French words (FAN). Behavior Research Methods, 46(4), 1128–1137.
Montefinese, M., Ambrosini, E., Fairfield, B., & Mammarella, N. (2014). The adaptation of the Affective Norms for English Words (ANEW) for Italian. Behavior Research Methods, 46(3), 887–903.
Moors, A., De Houwer, J., Hermans, D., Wanmaker, S., van Schie, K., van Harmelen, A. L., De Schryver, M., De Winne, J., & Brysbaert, M. (2013). Norms of valence, arousal, dominance, and age of acquisition for 4,300 Dutch words. Behavior Research Methods, 45(1), 169–177.
Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning. University of Illinois Press.
Patterson, K., Nestor, P. J., & Rogers, T. T. (2007). Where do you know what you know? The representation of semantic knowledge in the human brain. Nature Reviews Neuroscience, 8(12), 976–987.
Peng, C., Xu, X., & Bao, Z. (2023). Sentiment annotations for 3,827 simplified Chinese characters. Behavior Research Methods.
Petilli, M. A., Günther, F., Vergallito, A., Ciapparelli, M., & Marelli, M. (2021). Data-driven computational models reveal perceptual simulation in word processing. Journal of Memory and Language, 117, 104194.
Plaut, D. C., & Booth, J. R. (2000). Individual and developmental differences in semantic priming: Empirical and computational support for a single-mechanism account of lexical processing. Psychological Review, 107(4), 786–823.
Pobric, G., Jefferies, E., & Lambon Ralph, M. A. (2007). Anterior temporal lobes mediate semantic representation: Mimicking semantic dementia by using rTMS in normal participants. Proceedings of the National Academy of Sciences, 104(50), 20137–20141.
Pollock, L. (2018). Statistical and methodological problems with concreteness and other semantic variables: A list memory experiment case study. Behavior Research Methods, 50(3), 1198–1216.
Qiu, Y., Li, H., Li, S., Jiang, Y., Hu, R., & Yang, L. (2018). Revisiting correlations between intrinsic and extrinsic evaluations of word embeddings. In M. Sun, T. Liu, X. Wang, Z. Liu, & Y. Liu (Eds.), Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data (pp. 209–221). Springer.
Reagan, A. J., Danforth, C. M., Tivnan, B., Williams, J. R., & Dodds, P. S. (2017). Sentiment analysis methods for understanding large-scale texts: A case for using continuum-scored words and word shift graphs. EPJ Data Science, 6, 28.
Recchia, G., & Louwerse, M. M. (2015). Reproducing affective norms with lexical co-occurrence statistics: Predicting valence, arousal, and dominance. Quarterly Journal of Experimental Psychology, 68(8), 1584–1598.
Riegel, M., Wierzba, M., Wypych, M., Żurawski, Ł., Jednoróg, K., Grabowska, A., & Marchewka, A. (2015). Nencki Affective Word List (NAWL): The cultural adaptation of the Berlin Affective Word List-Reloaded (BAWL-R) for Polish. Behavior Research Methods, 47(4), 1222–1236.
Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161–1178.
Sommerauer, P., & Fokkens, A. (2018). Firearms and tigers are dangerous, kitchen knives and zebras are not: Testing whether word embeddings can tell. In: Paper presented at the Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP (pp. 276–286). Brussels, Belgium.
Stadthagen-Gonzalez, H., Imbault, C., Sánchez, M. A. P., & Brysbaert, M. (2017). Norms of valence and arousal for 14,031 Spanish words. Behavior Research Methods, 49(1), 111–123.
Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87(2), 245–251.
Szalay, L. B., & Deese, J. (1978). Subjective meaning and culture: An assessment through word associations. Lawrence Erlbaum.
Tsang, Y. K., Huang, J., Lui, M., Xue, M., Chan, Y. W. F., Wang, S., & Chen, H. C. (2018). MELD-SCH: A megastudy of lexical decision in simplified Chinese. Behavior Research Methods, 50(5), 1763–1777.
Turney, P. D., & Littman, M. L. (2003). Measuring praise and criticism: Inference of semantic orientation from association. ACM Transactions on Information Systems, 21(4), 315–346.
Utsumi, A. (2020). Exploring what is encoded in distributional word vectors: A neurobiologically motivated analysis. Cognitive Science, 44(6), e12844.
Van Rensbergen, B., Storms, G., & De Deyne, S. (2015). Examining assortativity in the mental lexicon: Evidence from word associations. Psychonomic Bulletin & Review, 22(6), 1717–1724.
Van Rensbergen, B., De Deyne, S., & Storms, G. (2016). Estimating affective word covariates using word association data. Behavior Research Methods, 48(4), 1644–1652.
Verona, E., Sprague, J., & Sadeh, N. (2012). Inhibitory control and negative emotional processing in psychopathy and antisocial personality disorder. Journal of Abnormal Psychology, 121(2), 498–510.
Võ, M. L. H., Conrad, M., Kuchinke, L., Urton, K., Hofmann, M. J., & Jacobs, A. M. (2009). The Berlin Affective Word List Reloaded (BAWL-R). Behavior Research Methods, 41(2), 534–538.
Vulić, I., Ponti, E. M., Litschko, R., Glavaš, G., & Korhonen, A. (2020). Probing pretrained language models for lexical semantics. In: Paper presented at the Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (pp. 7222–7240). Online.
Wang, Y., Zhou, L., & Luo, Y. (2008). The pilot establishment and evaluation of Chinese affective words system. Chinese Mental Health Journal, 22(8), 608–612.
Wang, X., Wu, W., Ling, Z., Xu, Y., Fang, Y., Wang, X., Binder, J. R., Men, W., Gao, J., & Bi, Y. (2018). Organizational principles of abstract words in the human brain. Cerebral Cortex, 28, 4305–4318.
Warriner, A. B., Kuperman, V., & Brysbaert, M. (2013). Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods, 45(4), 1191–1207.
Wrobel, M. R. (2020). The impact of lexicon adaptation on the emotion mining from software engineering artifacts. IEEE Access, 8, 48742–48751.
Xu, X., & Li, J. (2020). Concreteness/abstractness ratings for two-character Chinese words in MELD-SCH. PLoS ONE, 15(6), e0232133.
Xu, X., Li, J., & Guo, S. (2021). Age of acquisition ratings for 19,716 simplified Chinese words. Behavior Research Methods, 53, 558–573.
Xu, X., Li, J., & Chen, H. (2022). Valence and arousal ratings for 11,310 simplified Chinese words. Behavior Research Methods, 54, 26–41.
Acknowledgements
The authors would like to thank Dr. Xurong Xie and Dr. Rongfeng Su for discussing the implementation of the computational model. The computations were performed using research computing facilities offered by Information Technology Services, the University of Hong Kong. This study is supported by research grants awarded to the corresponding author by Shanghai Jiao Tong University (WF220414005). We would also like to express our gratitude to two anonymous reviewers for their helpful comments and suggestions.
Ethics declarations
Conflict of Interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Practices Statement
The extrapolated database can be accessed from the Open Science Framework repository. The study was not preregistered.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, T., Xu, X. The good, the bad, and the ambivalent: Extrapolating affective values for 38,000+ Chinese words via a computational model. Behav Res (2023). https://doi.org/10.3758/s13428-023-02274-3
DOI: https://doi.org/10.3758/s13428-023-02274-3