Abstract
Social network sites (SNS) have touched the lives of millions of people around the world. People share interests, ideas, photos, and activities on social networks with their family, colleagues, friends, and acquaintances. However, the degree of interaction among members varies widely. According to a principle from sociology, people with similar personalities interact with each other more frequently. A group of connected people with similar personality traits is termed a homophily. In this paper, we develop a method to identify homophilies by analyzing the Big5 personality traits of users from their interactions in an egocentric network such as Facebook. We observe that our homophilies correctly cluster 73 to 87% of users, depending on the personality trait. We also present a novel validation technique to verify the extracted homophilies in real life. We are the first to validate homophilies extracted from SNS usage in real life using an interview-based method and to compare them with baseline techniques. Our validation results show agreement among the raters of those homophilies in real life ranging from 0.207 (fair) to 0.709 (substantial).
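The verbal labels attached to the agreement values (0.207 as fair, 0.709 as substantial) match the commonly used Landis and Koch bands for kappa-type statistics. The following is a minimal sketch, assuming those bands and a two-rater Cohen's kappa. The function names and the sample ratings are hypothetical, and the paper itself may rely on a different agreement statistic, such as the intra-class correlation listed in its keywords.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (illustrative only)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    if p_chance == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


def agreement_label(value):
    """Map an agreement value to a verbal band (assumed Landis and Koch scale)."""
    if value < 0.0:
        return "poor"
    for upper_bound, label in [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]:
        if value <= upper_bound:
            return label
    return "almost perfect"


# Hypothetical ratings: 1 = "person belongs to the homophily", 0 = "does not".
rater_a = [1, 1, 0, 0]
rater_b = [1, 1, 0, 1]
kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 3), agreement_label(kappa))

# The two endpoints reported in the abstract, under the assumed bands.
print(agreement_label(0.207), agreement_label(0.709))
```

Under these assumed bands, the reported range spans the second-lowest positive band up to the second-highest, which is why the abstract describes the agreement as varying across homophilies rather than as uniformly strong.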
Acknowledgments
This research is funded by ICT Division, Ministry of Posts, Telecommunications and Information Technology, Government of the People’s Republic of Bangladesh.
Cite this article
Mukta, M.S.H., Ali, M.E. & Mahmud, J. Identifying and validating personality traits-based homophilies for an egocentric network. Soc. Netw. Anal. Min. 6, 74 (2016). https://doi.org/10.1007/s13278-016-0383-4
DOI: https://doi.org/10.1007/s13278-016-0383-4
Keywords
- Regression
- Classification
- Clustering
- Intra-class correlation