Perceived Versus Actual Predictability of Personal Information in Social Networks

  • Eleftherios Spyromitros-Xioufis
  • Georgios Petkos
  • Symeon Papadopoulos
  • Rob Heyman
  • Yiannis Kompatsiaris
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9934)


This paper examines privacy in the context of Online Social Networks (OSNs). In particular, it measures how predictable different types of personal information are from OSN data and compares this with users' perceptions about the disclosure of their information. To this end, a real-life dataset was compiled, consisting of the Facebook data (images, posts and likes) of 170 people along with their replies to a survey that covers both their personal information and their perceptions of the sensitivity and predictability of different types of information. We evaluate several learning techniques for predicting user attributes from their OSN data. Our analysis shows that users' perceptions regarding the disclosure of specific types of information are often incorrect. For instance, their political beliefs and employment status appear to be more predictable than they tend to believe. Interestingly, information that users characterize as more sensitive appears to be more easily predictable than they think, and vice versa (i.e., information characterized as relatively less sensitive is less easily predictable than users might have thought).


Privacy, Social networks, Personal attributes, Inference



This work is supported by the USEMP FP7 project, partially funded by the EC under contract number 611596.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Eleftherios Spyromitros-Xioufis (1)
  • Georgios Petkos (1)
  • Symeon Papadopoulos (1)
  • Rob Heyman (2)
  • Yiannis Kompatsiaris (1)

  1. CERTH-ITI, Thessaloniki, Greece
  2. iMinds-SMIT, Vrije Universiteit Brussel, Brussels, Belgium
