BERTs of a feather: Studying inter- and intra-group communication via information theory and language models

  • Original Manuscript
Behavior Research Methods

Abstract

When communicating, individuals alter their language to fulfill a myriad of social functions. In particular, linguistic convergence and divergence are fundamental in establishing and maintaining group identity. Quantitatively characterizing linguistic convergence is important when testing hypotheses surrounding language, including interpersonal and group communication. We provide a quantitative interpretation of linguistic convergence grounded in information theory. We then construct a computational model, built on top of a neural network model of language, that can be deployed to measure and test hypotheses about linguistic convergence in “big data.” We demonstrate the utility of our convergence measurement in two case studies: (1) showing that our measurement is indeed sensitive to linguistic convergence across turns in dyadic conversation, and (2) showing that our convergence measurement is sensitive to social factors that mediate convergence in Internet-based communities (specifically, r/MensRights and r/MensLib). Our measurement also captures differences in which social factors influence web-based communities. We conclude by discussing methodological and theoretical implications of this semantic convergence analysis.

Notes

  1. If we repeated this process for each word, however, the probability for each element of an utterance x, based on a comparative sample, could be thought of in terms of a multinomial distribution. For simplicity, we describe the sampling process in binary terms (yes/no between some x and some set y) and thus focus on the Bernoulli distribution for the rest of this paper.

  2. Entropy also decreases as two distributions become increasingly polar. If the probability of some term i is 1 in distribution a but 0 in distribution b, the entropy of P(i|a) and P(i|b) is 0, because one can easily predict i in a by knowing that it is the opposite of what has been observed for i in b. It is not clear how this condition could be met in language, however, and research shows that the language models we use to estimate entropy have a baseline similarity between any two randomly sampled word vectors that is \(> 0\) and \(< 1\) (Ethayarajh, 2019), rendering this condition impossible with our specific method.

  3. The method described in Rosen (2022) has two major differences from the current method. First, their measurement of convergence requires that the data contains samples from two groups with all individuals pre-labeled according to their group status in order to show that rhetoric is internally consistent within groups and inconsistent outside of them. Our current method does not require the presence of multiple groups in order to measure convergence. Second, their method requires the pre-selection of some set of key terms for analysis. Our method treats every word in an utterance as a unique experiment, and thus does not require any predefined lexicon in order to capture convergence.

  4. A Gaussian distribution is by no means the only way of converting a cosine value to a probability. For example, one could use \(\frac{1 + CoS(E_{xi}, E_{yj})}{2}\) to convert scalar cosine similarity values (CoS, the reciprocal of CoE) to a ratio in terms of maximum similarity. We prefer the Gaussian distribution here because it increases the burden of proof required to claim that two words mean the same thing based on the proximity of their word vectors: the scale parameter \(\sigma \) can be used to increase the penalty on word vectors that are dissimilar to one another.

  5. In truly unconstrained cases where one is comparing utterances to one another irrespective of interest in any one lexical item – i.e., comparing all sentences that invoke a specific phrase like “forced birth” – one should look for smaller sample sizes but a greater number of random samples taken in order to characterize the possible diversity of utterances in the data.

  6. https://github.com/saulalbert/CABNC

  7. The use of SBERT as a means of measuring semantic similarity (see Alatawi, Sheth, & Liu, 2023) has a major limitation, however, when compared to the method proposed here. Namely, SBERT is designed and trained to classify sentences with global semantic similarity (sentences that have been labeled as meaning the same thing, irrespective of their constituent components) as more similar to one another (Reimers & Gurevych, 2019). However, it is easy to imagine a case wherein individuals from a group, engaging with one another in ongoing discourse, author utterances that imply wildly different meanings while leaning into group lexico-semantic norms for the subcomponents of their utterances.
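
As a numerical illustration of Notes 2 and 4, the following Python sketch compares the linear mapping \(\frac{1 + CoS}{2}\) with a Gaussian-style mapping, and evaluates the entropy of a Bernoulli variable at its extremes. The Gaussian form used here (an unnormalized kernel applied to 1 - CoS), the default value of sigma, and all function names are assumptions made for illustration; they are not necessarily the parameterization used in the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def linear_probability(cos):
    """Linear mapping from Note 4: rescales [-1, 1] onto [0, 1]."""
    return (1.0 + cos) / 2.0


def gaussian_probability(cos, sigma=0.3):
    """Gaussian-style mapping (ASSUMED form, not taken from the paper).

    Treats 1 - cos as a distance from perfect similarity and applies an
    unnormalized Gaussian kernel, so identical vectors map to 1.0 and a
    smaller sigma penalizes dissimilar vectors more heavily.
    """
    distance = 1.0 - cos
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # outcome is certain, so there is no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


if __name__ == "__main__":
    # Toy vectors standing in for contextual word embeddings (hypothetical).
    e_x = [0.9, 0.1, 0.3]
    e_y = [0.2, 0.8, 0.4]
    cos = cosine_similarity(e_x, e_y)
    for sigma in (0.1, 0.3, 0.6):
        print(sigma, linear_probability(cos), gaussian_probability(cos, sigma))
    # Entropy vanishes at both polar extremes and peaks at p = 0.5.
    print(binary_entropy(0.0), binary_entropy(0.5), binary_entropy(1.0))
```

Under these assumptions, moderately similar vectors receive a much lower probability from the Gaussian mapping than from the linear one, and the gap widens as sigma shrinks, which is the "burden of proof" behavior described in Note 4.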

References

  • Adams, A., Miles, J., Dunbar, N. E., & Giles, H. (2018). Communication accommodation in text messages: Exploring liking, power, and sex as predictors of textisms. The Journal of Social Psychology, 158(4), 474–490. https://doi.org/10.1080/00224545.2017.1421895

  • Alatawi, F., Sheth, P., & Liu, H. (2023). Quantifying the echo chamber effect: An embedding distance based approach. Retrieved September 10, 2023, from arXiv:2307.04668

  • Albert, S., de Ruiter, L. E., & De Ruiter, J. (2015). CABNC: The Jeffersonian transcription of the Spoken British National Corpus.

  • Angus, D., Smith, A., & Wiles, J. (2012). Conceptual recurrence plots: Revealing patterns in human discourse. IEEE Transactions on Visualization and Computer Graphics, 18(6), 988–997. https://doi.org/10.1109/TVCG.2011.100

  • Angus, D., Watson, B., Smith, A., Gallois, C., & Wiles, J. (2012). Visualising conversation structure across time: Insights into effective doctor-patient consultations. PLoS ONE, 7(6), e38014.

  • Ba, L., & Zhao, W. G. W. (2021). Symbolic convergence or divergence? Making sense of (the Rhetorical) senses of a university-wide organizational change. Frontiers in Psychology, 12, 690757. https://doi.org/10.3389/fpsyg.2021.690757

  • Bäck, E. A., Bäck, H., Sendén, M. G., & Sikström, S. (2018). From I to we: Group formation and linguistic adaption in an online xenophobic forum. Journal of Social and Political Psychology, 6(1), 76–91. https://doi.org/10.5964/jspp.v6i1.741

  • Bailey, B. (2000). Language and negotiation of ethnic/racial identity among Dominican Americans. Language in Society, 29(4), 555–582. https://doi.org/10.1017/S0047404500004036

  • Bamman, D., Eisenstein, J., & Schnoebelen, T. (2014). Gender identity and lexical variation in social media. Journal of Sociolinguistics, 18(2), 135–160. https://doi.org/10.1111/josl.12080

  • Barker, J. O., & Rohde, J. A. (2019). Topic clustering of e-cigarette submissions among Reddit communities: A network perspective. Health Education & Behavior, 46(2_suppl), 59S–68S. https://doi.org/10.1177/1090198119863770

  • Bormann, E. G. (1982). The symbolic convergence theory of communication: Applications and implications for teachers and consultants. Journal of Applied Communication Research, 10(1), 50–61. https://doi.org/10.1080/00909888209365212

  • Bradac, J. J., Mulac, A., & House, A. (1988). Lexical diversity and magnitude of convergent versus divergent style shifting: Perceptual and evaluative consequences. Language & Communication, 8(3), 213–228. https://doi.org/10.1016/0271-5309(88)90019-5

  • Branigan, H. P., Pickering, M. J., & Cleland, A. A. (2000). Syntactic co-ordination in dialogue. Cognition, 75(2), B13–B25. https://doi.org/10.1016/S0010-0277(99)00081-5

  • Brennan, S. E., & Clark, H. H. (1996). Conceptual pacts and lexical choice in conversation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(6), 1482–1493. https://doi.org/10.1037//0278-7393.22.6.1482

  • Brennan, S., Galati, A., & Kuhlen, A. (2010). Two minds, one dialog: Coordinating speaking and understanding. The Psychology of Learning and Motivation: Advances in Research and Theory (pp. 301–344). Elsevier. https://doi.org/10.1016/C2009-0-62209-1

  • Cota, W., Ferreira, S. C., Pastor-Satorras, R., & Starnini, M. (2019). Quantifying echo chamber effects in information spreading over political communication networks. EPJ Data Science, 8(1), 1–13. https://doi.org/10.1140/epjds/s13688-019-0213-9

  • Dale, R., Duran, N. D., & Coco, M. (2018). Dynamic natural language processing with recurrence quantification analysis. Retrieved January 2, 2022, from arXiv:1803.07136

  • Dale, R., & Spivey, M. J. (2005). Categorical recurrence analysis of child language. Proceedings of the Annual Meeting of the Cognitive Science Society, 27(27), 530–535.

  • Danescu-Niculescu-Mizil, C., Gamon, M., & Dumais, S. (2011). Mark my words!: Linguistic style accommodation in social media. Proceedings of the 20th international conference on World wide web - WWW ’11, 745. https://doi.org/10.1145/1963405.1963509

  • Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: pre-training of deep bidirectional transformers for language understanding. Retrieved March 31, 2021, from arXiv:1810.04805

  • DiBranco, A. (2020). The Men’s Rights Movement and violence. Institute for Research on Male Supremacism. Retrieved June 15, 2022, from https://www.malesupremacism.org/2020/07/21/the-mens-rightsmovement-and-violence/

  • Dougherty, D. S., Kramer, M. W., Klatzke, S. R., & Rogers, T. K. K. (2009). Language convergence and meaning divergence: A meaning centered communication theory. Communication Monographs, 76(1), 20–46. https://doi.org/10.1080/03637750802378799

  • Dougherty, D. S., Mobley, S. K., & Smith, S. E. (2010). Language convergence and meaning divergence: A theory of intercultural communication. Journal of International and Intercultural Communication, 3(2), 164–186. https://doi.org/10.1080/17513051003611628

  • Doyle, G., Goldberg, A., Srivastava, S., & Frank, M. (2017). Alignment at work: Using language to distinguish the internalization and self-regulation components of cultural fit in organizations. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Vol. 1: Long Papers), 603–612. https://doi.org/10.18653/v1/P17-1056

  • Dragojevic, M., & Giles, H. (2014). Language and interpersonal communication: Their intergroup dynamics. In C. R. Berger (Ed.), Interpersonal Communication (pp. 29–51). De Gruyter. https://doi.org/10.1515/9783110276794.29

  • Dumais, S., Furnas, G., Landauer, T., Deerwester, S., & Harshman, R. (1996). Using latent semantic analysis to improve access to textual information. Proceedings, CHI, 88. https://doi.org/10.1145/57167.57214

  • Duran, N. D., Paxton, A., & Fusaroli, R. (2019). ALIGN: Analyzing linguistic interactions with generalizable techniques – a Python library. Psychological Methods, 24(4), 419.

  • Ethayarajh, K. (2019). How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings. Retrieved March 19, 2023, from arXiv:1909.00512

  • Gallois, C., Ogay, T., & Giles, H. (2005). Communication accommodation theory: A look back and a look ahead. Theorizing About Intercultural Communication.

  • Gallois, C., Gasiorek, J., Giles, H., & Soliz, J. (2016). Communication accommodation theory: integrations and new framework developments. In H. Giles (Ed.), Communication Accommodation Theory (1st ed., pp. 192-210). Cambridge University Press. https://doi.org/10.1017/CBO9781316226537.010

  • Garimella, K., Morales, G. D. F., Gionis, A., & Mathioudakis, M. (2017). Quantifying controversy in social media. https://doi.org/10.48550/arXiv.1507.05224, arXiv:1507.05224

  • Garrod, S., & Anderson, A. (1987). Saying what you mean in dialogue: A study in conceptual and semantic co-ordination. Cognition, 27. https://doi.org/10.1016/0010-0277(87)90018-7

  • Giles, H., Willemyns, M., Gallois, C., & Anderson, M. C. (2007). Accommodating a new frontier: The context of law enforcement. Social Communication (p. 35). Psychology.

  • Goldstein, A., Zada, Z., Buchnik, E., Schain, M., Price, A., Aubrey, B., ... & Nastase, S. A. (2022). Shared computational principles for language processing in humans and deep language models. Nature Neuroscience, 25(3), 369–380. https://doi.org/10.1038/s41593-022-01026-4

  • Hamilton, W. L., Clark, K., Leskovec, J., & Jurafsky, D. (2016). Inducing domain-specific sentiment lexicons from unlabeled corpora. Conference on Empirical Methods in Natural Language Processing, 2016, 595–605. https://doi.org/10.18653/v1/D16-1057

  • Hassan, S. A., & Shah, M. J. (2019). The anatomy of undue influence used by terrorist cults and traffickers to induce helplessness and trauma, so creating false identities. Ethics, Medicine and Public Health, 8, 97–107. https://doi.org/10.1016/j.jemep.2019.03.002

  • Hassan, S. (2017). Brainwashing young people into violent extremist cults. Freedom from Fear, 2017(13), 18–22. https://doi.org/10.18356/d37f4d01-en

  • Hilte, L. (2023). How is linguistic accommodation perceived in instant messaging? A survey on teenagers’ evaluations and perceptions. Journal of Language and Social Psychology, 0261927X2311671. https://doi.org/10.1177/0261927X231167108

  • Hoffmann, J., Borgeaud, S., Mensch, A., Buchatskaya, E., Cai, T., Rutherford, E., ... & Casas, D.d.L. (2022). Training Compute-Optimal Large Language Models. https://doi.org/10.48550/arXiv.2203.15556, arXiv:2203.15556

  • Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359–366. https://doi.org/10.1016/0893-6080(89)90020-8

  • Jamieson, R. K., Avery, J. E., Johns, B. T., & Jones, M. N. (2018). An instance theory of semantic memory. Computational Brain & Behavior, 1, 119–136.

  • Johns, B. T. (2021). Distributional social semantics: Inferring word meanings from communication patterns. Cognitive Psychology, 131, 101441.

  • Johns, B. T., Jamieson, R. K., & Jones, M. N. (2023). Scalable cognitive modelling: Putting Simon’s (1969) ant back on the beach. Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale.

  • Jones, M. N. (2016). Big data in cognitive science. Psychology Press.

  • Jones, M. N., & Mewhort, D. J. (2007). Representing word meaning and order information in a composite holographic lexicon. Psychological Review, 114(1), 1.

  • Jones, S., Cotterill, R., Dewdney, N., Muir, K., & Joinson, A. (2014). Finding Zelig in text: a measure for normalising linguistic accommodation. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers (pp. 455–465). https://aclanthology.org/C14-1044

  • Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., ... & Gray, S. (2020). Scaling laws for neural language models. arXiv:2001.08361

  • Keblusek, L., Giles, H., & Maass, A. (2017). Communication and group life: How language and symbols shape intergroup relations. Group Processes & Intergroup Relations, 20(5), 632–643. https://doi.org/10.1177/1368430217708864

  • Khan, A. (2019). Text mining to understand gender issues: Stories from the red pill, men’s rights, and feminism movements. University of Waterloo. Retrieved May 29, 2022, from https://uwspace.uwaterloo.ca/handle/10012/14973

  • Krendel, A., McGlashan, M., & Koller, V. (2021). The representation of gendered social actors across five manosphere communities on Reddit. Corpora, 17(2). Retrieved May 25, 2022, from https://eprints.lancs.ac.uk/id/eprint/155332/

  • LaFree, G., Atwell-Seate, A., Pisoiu, D., Stevenson, J., Tinsley, H., Manager, G., & Picarelli, J. (2016). Final report: Empirical assessment of domestic radicalization (EADR) (tech. rep. No. 250481). National Institute of Justice, Office of Justice Programs, U.S. Department of Justice.

  • Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211.

  • Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25(2–3), 259–284. https://doi.org/10.1080/01638539809545028

  • Landauer, T. K., McNamara, D. S., Dennis, S., & Kintsch, W. (2013). Handbook of latent semantic analysis. Psychology Press.

  • LaViolette, J., & Hogan, B. (2019). Using platform signals for distinguishing discourses: The case of Men’s rights and Men’s liberation on reddit. Proceedings of the Thirteenth International AAAI Conference on Web and Social Media (ICWSM), 323–334.

  • Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28(2), 203–208.

  • MacIntyre, P. D. (2019). Anxiety/uncertainty management and communication accommodation in women’s brief dyadic conversations with a stranger: an idiodynamic approach. SAGE Open, 9(3), 215824401986148. https://doi.org/10.1177/2158244019861482

  • Male Supremacy. (2021). Retrieved July 30, 2022, from https://www.splcenter.org/fighting-hate/extremistfiles/ideology/male-supremacy

  • Mange, J., Lepastourel, N., & Georget, P. (2009). Is your language a social clue? Lexical markers and social identity. Journal of Language and Social Psychology, 28(4), 364–380. https://doi.org/10.1177/0261927X09341956

  • Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space.

  • Nganga, S. W. (2020). Winning through (Dis)Alignment: The language of the Kenyan female politicians. In C. Kioko, R. Kagumire, & M. Matandela (Eds.), Challenging patriarchy: The role of patriarchy in the roll-back of democracy (pp. 150–162). Heinrich-Böll-Stiftung.

  • Nishida, S., Blanc, A., Maeda, N., Kado, M., & Nishimoto, S. (2021). Behavioral correlates of cortical semantic representations modeled by word vectors. PLOS Computational Biology, 17(6), e1009138. https://doi.org/10.1371/journal.pcbi.1009138

  • Park, A., & Conway, M. (2018). Harnessing Reddit to understand the written-communication challenges experienced by individuals with mental health disorders: Analysis of texts from mental health communities. Journal of Medical Internet Research, 20(4), e8219. https://doi.org/10.2196/jmir.8219

  • Paxton, A., Dale, R., & Richardson, D. C. (2016). Social coordination of verbal and nonverbal behaviours. In P. Passos, K. Davids, & J.Y. Chow (Eds.), Interpersonal coordination and performance in social systems (p. 259). Routledge.

  • Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. https://doi.org/10.3115/v1/D14-1162

  • Pérez-Sabater, C., & Maguelouk, M. G. (2019). Managing identity in football communities on Facebook: Language preference and language mixing strategies. Lingua, 225, 32–49. https://doi.org/10.1016/j.lingua.2019.04.003

  • Pickering, M. J., & Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27(2), 169–190. https://doi.org/10.1017/S0140525X04000056

  • Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using siamese BERT-networks. Retrieved May 1, 2020, from arXiv:1908.10084

  • Reitter, D., & Moore, J. D. (2014). Alignment and task success in spoken dialogue. Journal of Memory and Language, 76, 29–46.

  • Ribeiro, M. H., Blackburn, J., Bradlyn, B., De Cristofaro, E., Stringhini, G., Long, S., ... & Greenberg, S. (2021). The evolution of the manosphere across the web. Retrieved June 14, 2022, from arXiv:2001.07600

  • Roozenbeek, J., & Salvador Palau, A. (2017). I read it on reddit: Exploring the role of online communities in the 2016 US elections news cycle. In G. L. Ciampaglia, A. Mashhadi, & T. Yasseri (Eds.), Social Informatics (pp. 192-220). Springer International Publishing. https://doi.org/10.1007/978-3-319-67256-4_16

  • Rosen, Z. P. (2022). A BERT’s eye view: A big data framework for assessing language convergence and accommodation. Journal of Language and Social Psychology. https://doi.org/10.1177/0261927X221095865

  • Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27, 55.

  • Shatz, I. (2017). Fast, free, and targeted: Reddit as a source for recruiting participants online. Social Science Computer Review, 35(4), 537–549. https://doi.org/10.1177/0894439316650163

  • Shin, H., & Doyle, G. (2018). Alignment, acceptance, and rejection of group identities in online political discourse. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, 1–8. https://doi.org/10.18653/v1/N18-4001

  • Smaldino, P. E. (2019). Social identity and cooperation in cultural evolution. Behavioural Processes, 161, 108–116. https://doi.org/10.1016/j.beproc.2017.11.015

  • Smaldino, P. E., Flamson, T. J., & McElreath, R. (2018). The evolution of covert signaling. Scientific Reports, 8(1), 4905. https://doi.org/10.1038/s41598-018-22926-1

  • Soler, A. G., & Apidianaki, M. (2021). Let’s play mono-poly: BERT can reveal words’ polysemy level and partitionability into senses. Retrieved September 13, 2022, from arXiv:2104.14694

  • Soliz, J., Giles, H., & Gasiorek, J. (2021). Communication accommodation theory: Converging toward an understanding of communication adaptation in interpersonal relationships. In D.O. Braithwaite, & P. Schrodt (Eds.), Engaging Theories in Interpersonal Communication: Multiple Perspectives (3rd ed., pp. 130-142). Routledge. https://doi.org/10.4324/9781003195511

  • Stine, Z. K., & Agarwal, N. (2020). Comparative discourse analysis using topic models: Contrasting perspectives on China from Reddit. International Conference on Social Media and Society, 73–84. https://doi.org/10.1145/3400806.3400816

  • Tajfel, H. (1979). Individuals and groups in social psychology. British Journal of Social and Clinical Psychology, 18(2), 183–190. https://doi.org/10.1111/j.2044-8260.1979.tb00324.x

  • Tajfel, H., Billig, M. G., Bundy, R. P., & Flament, C. (1971). Social categorization and intergroup behaviour. European Journal of Social Psychology, 1(2), 149–178. https://doi.org/10.1002/ejsp.2420010202

  • Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24–54. https://doi.org/10.1177/0261927X09351676

  • Terren, L., & Borge-Bravo, R. (2021). Echo chambers on social media: A systematic review of the literature. Review of Communication Research, 9, 99–118. Retrieved September 10, 2023, from https://rcommunicationr.org/index.php/rcr/article/view/94

  • Tolston, M. T., Riley, M. A., Mancuso, V., Finomore, V., & Funke, G. J. (2019). Beyond frequency counts: Novel conceptual recurrence analysis metrics to index semantic coordination in team communications. Behavior Research Methods, 51, 342–360.

  • Utsumi, A. (2020). Exploring what is encoded in distributional word vectors: A neurobiologically motivated analysis. Cognitive Science, 44(6), e12844. https://doi.org/10.1111/cogs.12844

  • Velásquez, N., Manrique, P., Sear, R., Leahy, R., Restrepo, N. J., Illari, L., ... & Lupu, Y. (2021). Hidden order across online extremist movements can be disrupted by nudging collective chemistry. Scientific Reports, 11(1), 9965. https://doi.org/10.1038/s41598-021-89349-3

  • Villa, G., Pasi, G., & Viviani, M. (2021). Echo chamber detection and analysis. Social Network Analysis and Mining, 11(1), 78. https://doi.org/10.1007/s13278-021-00779-3

  • Villalobos, P., Sevilla, J., Heim, L., Besiroglu, T., Hobbhahn, M., & Ho, A. (2022). Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning. https://doi.org/10.48550/arXiv.2211.04325, arXiv:2211.04325

  • Wiedemann, G., Remus, S., Chawla, A., & Biemann, C. (2019). Does BERT make any sense? Interpretable word sense disambiguation with contextualized embeddings. Retrieved September 13, 2022, from arXiv:1909.10430

  • Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., ... & Cistac, P. (2020). HuggingFace’s Transformers: State-of-the-art natural language processing. Retrieved September 30, 2021, from arXiv:1910.03771

  • Wurm, L. H., & Fisicaro, S. A. (2014). What residualizing predictors in regression analyses does (and what it does not do). Journal of Memory and Language, 72, 37–48. https://doi.org/10.1016/j.jml.2013.12.003

  • Xu, Y., & Reitter, D. (2015). An evaluation and comparison of linguistic alignment measures. In Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics (pp. 58–67). https://doi.org/10.3115/v1/W15-1107

  • Yenicelik, D., Schmidt, F., & Kilcher, Y. (2020). How does BERT capture semantics? A closer look at polysemous words. Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 156–162. https://doi.org/10.18653/v1/2020.blackboxnlp-1.15

  • Zhang, D., Lin, H., Liu, X., Zhang, H., & Zhang, S. (2019). Combining the attention network and semantic representation for Chinese verb metaphor identification. IEEE Access, 7, 137103–137110. https://doi.org/10.1109/ACCESS.2019.2932136

Author information

Corresponding author

Correspondence to Zachary P Rosen.

About this article

Cite this article

Rosen, Z.P., Dale, R. BERTs of a feather: Studying inter- and intra-group communication via information theory and language models. Behav Res (2023). https://doi.org/10.3758/s13428-023-02267-2

  • DOI: https://doi.org/10.3758/s13428-023-02267-2
