Hiding Your Face Is Not Enough: user identity linkage with image recognition

  • Original Article
  • Social Network Analysis and Mining

Abstract

People tend to have multiple identities or personalities in their real and on-line lives. In real life, these identities may even be associated with different names used with parents, with groups of friends, or in formal contexts. On-line, this tendency has exploded: people can express different identities under different names on different social networks (SNs), and they attach to their actions and connections on these platforms the same meaning as those in real life. Thus, a fundamental question arises: can profiles of the same user be connected across multiple SNs? In this paper, we present the Hiding Your Face Is Not Enough (HYFINE) model, a User Identity Linking model that fully exploits the images in profiles. The HYFINE model consists of two parts: (1) the corpus extraction system and (2) the classification system HYFINE-c, which classifies whether two profiles are different identities of the same user by fully using images along with other features. We show that the HYFINE model, by exploiting images in profiles, can match the profiles of users across different SNs with high performance.
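
To make the idea of pairwise profile classification concrete, the following is a minimal sketch of how a pair of profiles might be turned into a feature vector that combines an image feature with one other feature. It is not the HYFINE-c implementation: the choice of features (an average-hash image comparison and a normalised edit distance on usernames), the profile fields `username` and `image`, and all function names are assumptions made for illustration. Images are represented as small grayscale grids so the sketch needs only the standard library.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical after lowercasing."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def average_hash(pixels: Sequence[Sequence[int]]) -> List[int]:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(h1: Sequence[int], h2: Sequence[int]) -> int:
    """Number of positions at which two equal-length hashes differ."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(h1, h2))


def image_similarity(img_a, img_b) -> float:
    """Fraction of matching hash bits between two same-sized images."""
    h1, h2 = average_hash(img_a), average_hash(img_b)
    return 1.0 - hamming(h1, h2) / len(h1)


def pair_features(profile_a: dict, profile_b: dict) -> List[float]:
    """Feature vector for one candidate pair (hypothetical profile schema)."""
    return [
        name_similarity(profile_a["username"], profile_b["username"]),
        image_similarity(profile_a["image"], profile_b["image"]),
    ]


# Toy profiles: 2x2 grayscale "images" stand in for downscaled profile pictures.
a = {"username": "jane_doe", "image": [[10, 200], [220, 30]]}
b = {"username": "janedoe", "image": [[12, 190], [210, 40]]}
print(pair_features(a, b))  # [0.875, 1.0]
```

In a supervised setting, vectors like these, computed for labelled pairs of profiles, would be the input to a binary classifier that predicts whether the two profiles belong to the same user.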

Notes

  1. https://www.twitter.com.

  2. https://www.facebook.com.

  3. https://www.instagram.com.

  4. https://www.linkedin.com.

  5. The system is available on https://github.com/leooJo/SeleniumWebScraper.

  6. https://en.wikipedia.org/wiki/List_of_most-followed_Twitter_accounts.

  7. https://en.wikipedia.org/wiki/List_of_most-followed_Facebook_pages.

  8. https://en.wikipedia.org/wiki/List_of_most-followed_Instagram_accounts.

  9. https://www.seleniumhq.org/.

  10. www.unece.org.

  11. http://opencv.org/.

  12. The implementation is available in PythonBook format at the following link: https://github.com/leooJo/SeleniumWebScraper

Author information

Corresponding author

Correspondence to Leonardo Ranaldi.

About this article

Cite this article

Ranaldi, L., Zanzotto, F.M. Hiding Your Face Is Not Enough: user identity linkage with image recognition. Soc. Netw. Anal. Min. 10, 56 (2020). https://doi.org/10.1007/s13278-020-00673-4

  • DOI: https://doi.org/10.1007/s13278-020-00673-4
