Identifying and validating personality traits-based homophilies for an egocentric network

  • Md. Saddam Hossain Mukta (Email author)
  • Mohammed Eunus Ali
  • Jalal Mahmud
Original Article


Social network sites (SNS) have touched the lives of millions of people around the world. People share interests, ideas, photos, and activities on social networks with their family, colleagues, friends, and acquaintances. However, the degree of interaction among members varies widely. According to a principle from sociology, people with similar personalities interact with each other more frequently. A group of connected people with similar personality traits is termed a homophily. In this paper, we develop a method to identify homophilies by analyzing the Big5 personality traits of users from their interactions in an egocentric network such as Facebook. We observe that our homophilies correctly cluster 73 to 87 % of users, depending on the personality trait. We also present a novel validation technique to verify the extracted homophilies in real life. We are the first to validate extracted homophilies, and to compare them with baseline techniques derived from SNS usage, in real life using an interview-based method. Our validation results show agreement among the raters of those homophilies ranging from 0.207 (fair) to 0.709 (substantial).
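The qualitative labels attached to the agreement values above ("fair", "substantial") match the widely used Landis and Koch benchmark bands for agreement coefficients. The following is a minimal sketch of that mapping, assuming those standard band edges; the function name `agreement_label` is hypothetical, and the exact cut-offs used in the paper are not stated here.

```python
# Illustrative sketch only: the band edges follow the commonly cited
# Landis and Koch benchmarks and are an assumption, not taken from the paper.
UPPER_BOUNDS = [
    (0.20, "slight"),
    (0.40, "fair"),
    (0.60, "moderate"),
    (0.80, "substantial"),
    (1.00, "almost perfect"),
]


def agreement_label(coefficient: float) -> str:
    """Map an agreement coefficient in [-1, 1] to a qualitative label."""
    if not -1.0 <= coefficient <= 1.0:
        raise ValueError("agreement coefficient must lie in [-1, 1]")
    if coefficient < 0.0:
        return "poor"
    # Return the first band whose upper bound covers the coefficient.
    for upper, label in UPPER_BOUNDS:
        if coefficient <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # The two extremes reported in the abstract.
    for value in (0.207, 0.709):
        print(value, agreement_label(value))
```

Under these assumed bands, the lowest reported value falls just inside the "fair" range and the highest falls in the "substantial" range, which is consistent with the labels given in the abstract.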


Regression, Classification, Clustering, Intra-class correlation



This research is funded by ICT Division, Ministry of Posts, Telecommunications and Information Technology, Government of the People’s Republic of Bangladesh.



Copyright information

© Springer-Verlag Wien 2016

Authors and Affiliations

  • Md. Saddam Hossain Mukta (1), Email author
  • Mohammed Eunus Ali (1)
  • Jalal Mahmud (2)
  1. Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1000, Bangladesh
  2. IBM Research-Almaden, San Jose, USA
