Affinity Groups: A Linguistic Analysis for Social Network Groups Identification

  • Conference paper
Social Informatics (SocInfo 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10540)

Abstract

Socially cohesive groups tend to share similar ideas and express themselves in similar ways when posting their thoughts in online social networks. Therefore, some researchers have conducted studies to uncover the issues discussed by groups who are structurally connected in a network. In this study, we take advantage of the language usage patterns present in online communication to unveil affinity groups, i.e., like-minded people who are not necessarily interacting in the network at present. We analyze 735K tweets written by 620 unique users and compute scores for 14 grammatical categories using the Linguistic Inquiry and Word Count (LIWC) software. With the LIWC scores, we build a vector for each user, apply a similarity measure, and feed the resulting similarities to an affinity propagation clustering algorithm to find the affinity groups. Following the proposed method, clusters of religious activists, journalists, and entrepreneurs, among others, emerge. We automatically characterize each cluster using a topic modeling algorithm and validate the generated topics with a user study conducted with 200 people. As a result, more than 70% of the participants agreed on their selection. These results confirm that communities share certain similarities in the use of language, traits that characterize their behavior and grouping.
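
The pipeline outlined in the abstract (per-user score vectors, a similarity measure, affinity propagation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the user names and four-category scores are invented toy values (the study uses 14 LIWC categories computed by the LIWC software), cosine similarity is an assumed choice because the abstract does not name the similarity measure, and the median preference, damping factor, and iteration count are assumed defaults. The message-passing updates follow the standard responsibility/availability rules of affinity propagation (Frey and Dueck, 2007).

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise similarities; the diagonal holds the exemplar 'preference'."""
    n = len(vectors)
    S = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    # Assumption: preference = median of the off-diagonal similarities.
    pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        S[i][i] = pref
    return S


def affinity_propagation(S, damping=0.8, iterations=200):
    """Return, for each point, the index of the exemplar it is assigned to."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            second = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else scores[best])
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            pos[k] = R[k][k]  # the self-responsibility is not clipped
            total = sum(pos)
            for i in range(n):
                new = total - pos[k] if i == k else min(0.0, total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # no exemplar emerged: treat every point as its own group
        return list(range(n))
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


# Hypothetical per-user scores for four made-up categories (toy data only).
toy_scores = {
    "user_a": [9, 8, 1, 1],
    "user_b": [8, 8, 2, 1],
    "user_c": [7, 9, 2, 2],
    "user_d": [1, 2, 8, 9],
    "user_e": [2, 2, 8, 8],
    "user_f": [2, 1, 9, 7],
}

names = list(toy_scores)
labels = affinity_propagation(similarity_matrix(list(toy_scores.values())))
for name, label in zip(names, labels):
    print(f"{name} -> group represented by {names[label]}")
```

In this toy setting, users with similar score profiles end up sharing an exemplar. With real data, a tested library implementation of affinity propagation would normally be preferable to this hand-written loop, and the preference value strongly influences how many groups emerge.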

Author information

Corresponding author

Correspondence to Jonathan Mendieta.

A Appendix

A.1 Survey Example

In your opinion, which of the following topics best describes the set of words presented below? Please choose only one or two topics. Underline the words that justify your selection.

[Figure a: survey example]

A.2 Participants’ Demographics

Table 2 describes the demographics of the surveyed participants.

Table 2. Demographics of respondents

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Mendieta, J., Baquerizo, G., Villavicencio, M., Vaca, C. (2017). Affinity Groups: A Linguistic Analysis for Social Network Groups Identification. In: Ciampaglia, G., Mashhadi, A., Yasseri, T. (eds) Social Informatics. SocInfo 2017. Lecture Notes in Computer Science, vol 10540. Springer, Cham. https://doi.org/10.1007/978-3-319-67256-4_21

  • DOI: https://doi.org/10.1007/978-3-319-67256-4_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67255-7

  • Online ISBN: 978-3-319-67256-4

  • eBook Packages: Computer Science, Computer Science (R0)
