User characterization for online social networks

Tuna, Tayfun; Akbas, Esra; Aksoy, Ahmet; Canbaz, Muhammed Abdullah; Karabiyik, Umit; Gonen, Bilal; Aygun, Ramazan

doi:10.1007/s13278-016-0412-3

Review Article
Published: 04 November 2016

Volume 6, article number 104, (2016)

Social Network Analysis and Mining

1894 Accesses
30 Citations
11 Altmetric

Abstract

Online social network analysis has attracted great attention, driven by the vast number of users sharing information and the availability of APIs that help to crawl online social network data. In this paper, we survey research studies that are helpful for user characterization, since online users may not always reveal their true identity or attributes. We focus especially on user attribute determination, such as gender and age; user behavior analysis, such as motives for deception; mental models that are indicators of user behavior; user categorization, such as bots versus humans; and entity matching across different social networks. We believe our summary of analyses of user characterization will provide important insights to researchers and lead to better services for online users.

Author information

Authors and Affiliations

Department of Computer Science, University of Houston, Houston, TX, 77204-3010, USA
Tayfun Tuna
Department of Computer Science, Florida State University, Tallahassee, FL, 32306-4530, USA
Esra Akbas
Department of Computer Science and Engineering, University of Nevada, Reno, Reno, NV, 89557, USA
Ahmet Aksoy & Muhammed Abdullah Canbaz
Department of Computer Science, Sam Houston State University, Huntsville, TX, 77341, USA
Umit Karabiyik
School of Information Technology, University of Cincinnati, Cincinnati, OH, 45221, USA
Bilal Gonen
Department of Computer Science, University of Alabama in Huntsville, Huntsville, AL, 35899, USA
Ramazan Aygun

Corresponding author

Correspondence to Tayfun Tuna.

About this article

Cite this article

Tuna, T., Akbas, E., Aksoy, A. et al. User characterization for online social networks. Soc. Netw. Anal. Min. 6, 104 (2016). https://doi.org/10.1007/s13278-016-0412-3

Received: 28 March 2016
Revised: 23 September 2016
Accepted: 22 October 2016
Published: 04 November 2016
DOI: https://doi.org/10.1007/s13278-016-0412-3
