Abstract
Profile linking is the task of connecting the profiles of a single user across different social networks. Linked profiles can help companies build psychographics of their potential customers and segment them for cost-effective targeted marketing, help advertisers target personalized ads, and help security practitioners capture detailed characteristics of malicious or fraudulent users. Existing methods link profiles by observing high similarity between the most recent (current) values of attributes such as name and username. However, for users who change their attributes over time and choose dissimilar values across their profiles, these current values have low similarity, and existing methods then falsely conclude that the profiles refer to different users. To reduce such false conclusions, we propose gathering the history of values assigned to an attribute over time and comparing attribute histories to link user profiles across networks. We believe that attribute history highlights user preferences and behavior in creating attribute values on a social network. When these preferences coexist across profiles on different social networks, the resulting attribute histories are similar, which suggests that the profiles potentially refer to a single user. In this study, we quantify the importance of attribute history for profile linking on a dataset of real-world users with profiles on Twitter, Facebook, Instagram and Tumblr. We show that attribute history correctly links 48% more profile pairs with non-matching current values that existing methods incorrectly leave unlinked. We further explore whether factors such as the longevity and availability of attribute history on either profile affect linking performance. To the best of our knowledge, this is the first study that explores the viability of using attribute history to link profiles on social networks.
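The contrast between comparing current values and comparing attribute histories can be illustrated with a minimal sketch. It is not the method evaluated in this article: the string similarity measure (Python's `difflib.SequenceMatcher`), the best-pair aggregation, the function names and the example usernames are all assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; an assumed stand-in for any name/username metric."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def current_value_score(history_a: list[str], history_b: list[str]) -> float:
    """Compare only the most recent value of each history (oldest value first)."""
    if not history_a or not history_b:
        return 0.0
    return similarity(history_a[-1], history_b[-1])


def history_score(history_a: list[str], history_b: list[str]) -> float:
    """Compare every past value on one profile with every past value on the other.

    Taking the best-matching pair is an illustrative aggregation, not the
    article's scoring function.
    """
    if not history_a or not history_b:
        return 0.0
    return max(similarity(a, b) for a in history_a for b in history_b)


# Hypothetical username histories for one user on two networks, oldest first.
network_a_usernames = ["john_doe88", "jd_photos", "nightowl"]
network_b_usernames = ["johndoe88", "sunset.snaps"]

print("current values:", round(current_value_score(network_a_usernames, network_b_usernames), 2))
print("with history:  ", round(history_score(network_a_usernames, network_b_usernames), 2))
```

In this hypothetical pair the current usernames share almost nothing, while an older value on one profile nearly matches an older value on the other, so the history-based score is much higher than the score computed from current values alone.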
Notes
The Tumblr API does not share a unique user_id that could be used to keep track of changes to a user's Tumblr profile; hence, developing an automated tracking system is challenging.
References
Bartunov S, Korshunov A, Park S-T, Ryu W, Lee H (2012) Joint link-attribute user identity resolution in online social networks. In: Proceedings of the 6th international conference on knowledge discovery and data mining, workshop on social network mining and analysis, SNAKDD '12. ACM, San Diego, CA, USA
Chen T, Kaafar MA, Friedman A, Boreli R (2012) Is more always merrier?: a deep dive into online social footprints. In: Proceedings of the 2012 ACM workshop on workshop on online social networks, WOSN '12. ACM, New York, NY, USA, pp 67–72. doi:10.1145/2342549.2342565
Chen Y, Zhuang C, Cao Q, Hui P (2014) Understanding cross-site linking in online social networks. In: Proceedings of the 8th workshop on social network mining and analysis, SNAKDD '14. ACM, New York, NY, USA, pp 61–69. doi:10.1145/2659480.2659498
Cockerell G (2010) Making marketing meaningful. Kendall Hunt Publishing Company, Dubuque
Dewan P, Gupta M, Goyal K, Kumaraguru P (2013) MultiOSN: realtime monitoring of real world events on multiple online social media. In: Proceedings of the 5th IBM collaborative academia research exchange workshop, I-CARE '13. ACM, New York, NY, USA, pp 61–64. doi:10.1145/2528228.2528235
Feizy R, Wakeman I, Chalmers D (2009) Transformation of online representation through time. In: Proceedings of the 2009 IEEE/ACM international conference on advances in social networks analysis and mining, ASONAM '09, pp 273–278. doi:10.1109/ASONAM.2009.64
Goga O, Lei H, Parthasarathi SHK, Friedland G, Sommer R, Teixeira R (2013) Exploiting innocuous activity for correlating users across sites. In: Proceedings of the 22nd international conference on world wide web, WWW '13. ACM, New York, NY, USA, pp 447–458. doi:10.1145/2488388.2488428
Grant T (2012) Txt 4n6: method, consistency, and distinctiveness in the analysis of SMS text messages. J Law Econ Policy 21:467
Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. In: Machine learning, vol 46. Kluwer Academic Publishers, Hingham, MA, USA, pp 389–422. doi:10.1023/A:1012487302797
Heitz G, Gould S, Saxena A, Koller D (2009) Cascaded classification models: combining models for holistic scene understanding. In: Proceedings of the neural information processing systems, NIPS '09
Iofciu T, Fankhauser P, Abel F, Bischoff K (2011) Identifying users across social tagging systems. In: Proceedings of the 5th international AAAI conference on weblogs and social media, ICWSM '11, pp 522–525
Irani D, Webb S, Li K, Pu C (2009) Large online social footprints—an emerging threat. In: Proceedings of the 2009 international conference on computational science and engineering, CSE '09, vol 3. IEEE Computer Society, Washington, DC, USA, pp 271–276. doi:10.1109/CSE.2009.459
Jain P, Kumaraguru P (2016) On the dynamics of username changing behavior on twitter. In: Proceedings of the 3rd IKDD conference on data science, CODS '16. ACM, New York, NY, USA, pp 61–66. doi:10.1145/2888451.2888452
Jain P, Kumaraguru P, Joshi A (2013) @I Seek 'fb.me': identifying users across multiple online social networks. In: Proceedings of the 22nd international conference on world wide web, WWW '13 Companion. ACM, New York, NY, USA, pp 1259–1268. doi:10.1145/2487788.2488160
Jain P, Kumaraguru P, Joshi A (2015) Other times, other values: leveraging attribute history to link user profiles across online social networks. In: Proceedings of the 26th ACM conference on hypertext & social media, HT '15. ACM, New York, NY, USA, pp 247–255. doi:10.1145/2700171.2791040
Li P, Dong X, Maurino A, Srivastava D (2011) Linking temporal records. In: Proceedings of the international conference on very large data bases, VLDB '11, vol 4. VLDB Endowment, pp 956–967
Liu J, Zhang F, Song X, Song YI, Lin CY, Hon HW (2013) What's in a Name?: an unsupervised approach to link users across communities. In: Proceedings of the 6th ACM international conference on web search and data mining, WSDM '13. ACM, New York, NY, USA, pp 495–504. doi:10.1145/2433396.2433457
Liu S, Wang S, Zhu F, Zhang J, Krishnan R (2014) HYDRA: large-scale social identity linkage via heterogeneous behavior modeling. In: Proceedings of the 2014 ACM SIGMOD international conference on management of data, SIGMOD '14. ACM, New York, NY, USA, pp 51–62. doi:10.1145/2588555.2588559
Liu Y, Kliman-Silver C, Mislove A (2015) The tweets they are a-Changin': evolution of twitter users and behavior. In: Proceedings of the 8th international AAAI conference on weblogs and social media, ICWSM '14, pp 305–314
Malhotra A, Totti L, Meira Jr W, Kumaraguru P, Almeida V (2012) Studying user footprints in different online social networks. In: Proceedings of the 2012 international conference on advances in social networks analysis and mining, ASONAM '12. IEEE Computer Society, pp 1065–1070
Motoyama M, Varghese G (2009) I seek you: searching and matching individuals in social networks. In: Proceedings of the 11th international workshop on web information and data management, WIDM '09. ACM, New York, NY, USA, pp 67–75. doi:10.1145/1651587.1651604
Narayanan A, Shmatikov V (2009) De-anonymizing social networks. In: Proceedings of the 2009 30th IEEE symposium on security and privacy, SP '09. IEEE Computer Society, Washington, DC, USA, pp 173–187. doi:10.1109/SP.2009.22
Perito D, Castelluccia C, Kaafar MA, Manils P (2011) How unique and traceable are usernames? In: Proceedings of the 11th international conference on privacy enhancing technologies, PETS '11. Springer, Berlin, Heidelberg, pp 1–17
Shehab M, Ko MN, Touati H (2012) Social networks profile mapping using games. In: Presented as part of the 3rd USENIX conference on web application development, WebApps 12, pp 27–38
Shi X, Nallapati R, Leskovec J, McFarland D, Jurafsky D (2010) Who leads whom: topical lead-lag analysis across corpora. In: Proceedings of the neural information processing systems workshop, NIPS workshop '10. doi:10.1.1.190.4612
Smith TF, Waterman MS (1981) Identification of common molecular subsequences. J Mol Biol 147:195–197
Szomszor MN, Cantador I, Alani H (2008) Correlating user profiles from multiple folksonomies. In: Proceedings of the 19th ACM conference on hypertext and hypermedia, HT '08. ACM, New York, NY, USA, pp 33–42. doi:10.1145/1379092.1379103
Weinstein A (2004) Handbook of market segmentation: strategic targeting for business and technology firms. Haworth Press, Philadelphia
Zafarani R, Liu H (2009) Connecting corresponding identities across communities. In: Proceedings of the international AAAI conference on web and social media, ICWSM '09, pp 354–357. http://www.aaai.org/ocs/index.php/ICWSM/09/paper/view/209
Zafarani R, Liu H (2013) Connecting users across social media sites: a behavioral-modeling approach. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining, KDD '13. ACM, New York, NY, USA, pp 41–49. doi:10.1145/2487575.2487648
Zhang J, Wang C, Wang J (2014) Learning temporal dynamics of behavior propagation in social networks. In: Proceedings of the AAAI conference on artificial intelligence, AAAI '14, pp 229–236. http://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8315
Acknowledgments
We would like to thank the members of Precog, a research group at IIIT-Delhi, and the members of the Cybersecurity Education and Research Centre (CERC), IIIT-Delhi, for their constant feedback and support. The research presented is funded by TCS Research Labs, India, and the first author is the awardee of the TCS research fellowship.
Additional information
An early version of this manuscript appeared in the 2015 ACM Conference on Hypertext and Social Media (HT) (Jain et al. 2015).
Cite this article
Jain, P., Kumaraguru, P. & Joshi, A. Other times, other values: leveraging attribute history to link user profiles across online social networks. Soc. Netw. Anal. Min. 6, 85 (2016). https://doi.org/10.1007/s13278-016-0391-4
DOI: https://doi.org/10.1007/s13278-016-0391-4