Pairing Users in Social Media via Processing Meta-data from Conversational Files

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11932)

Abstract

Massive amounts of data are generated today by users engaging on social media. Although they know that whatever they post can be viewed, downloaded and analyzed by unauthorized entities, a large number of people are still willing to compromise their privacy. This trend may change, however. Improved awareness of the need to protect content on social media, coupled with governments creating and enforcing data protection laws, means that in the near future users may become increasingly protective of what they share. Furthermore, new laws could limit what data social media companies can use without explicit consent from users. In this paper, we present and address a relatively new problem in privacy-preserving mining of social media logs. Specifically, the problem is the feasibility of deriving the topology of network communications (i.e., matching senders and receivers in a social network) using only the meta-data of conversational files shared by users, after all identities and content have been anonymized. More explicitly, if users are willing to share only (a) whether a message was sent or received, (b) the temporal ordering of messages and (c) the length of each message (after anonymizing everything else, including usernames, in their social media logs), how can the underlying topology of sender-receiver patterns be generated? To address this problem, we present a Dynamic Time Warping based solution that models the meta-data as a time series sequence. We present a formal algorithm and results in multiple scenarios wherein users may or may not delete content arbitrarily before sharing. Our performance results are very favorable when the approach is applied in the context of Twitter. Towards the end of the paper, we also present practical applications of our problem and solutions. To the best of our knowledge, the problem we address and the solution we propose are unique, and could provide important future perspectives on learning from privacy-preserving mining of social media logs.
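To make the idea concrete, the following is a minimal sketch of how Dynamic Time Warping could compare two anonymized logs using message lengths alone. It is not the formal algorithm of the paper, which the abstract does not spell out. The function names, the absolute-difference cost, the anonymized identifiers and the sample lengths are all assumptions chosen for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences, O(len(a) * len(b)).

    Assumption: the local cost is the absolute difference of message lengths.
    """
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] is matched to an already used b element
                cost[i][j - 1],      # b[j-1] is matched to an already used a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def best_match(sent_lengths, candidates):
    """Return the candidate id whose received-length sequence is closest under DTW."""
    return min(candidates, key=lambda uid: dtw_distance(sent_lengths, candidates[uid]))


# Hypothetical anonymized logs: only message lengths, kept in temporal order.
sent_by_u1 = [12, 40, 7, 55]
received = {
    "anon_a": [90, 3, 14, 61],
    "anon_b": [12, 40, 55],  # one message was deleted before sharing
}
match = best_match(sent_by_u1, received)
print(match)
```

In this toy input the log with a deleted message is still the closest candidate, because DTW tolerates sequences of different lengths by letting one element align with several. A fuller treatment would also need to handle ties and many-to-many conversations, which this sketch ignores.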

Keywords

Social media, Privacy, Big-data, Meta-data, Dynamic Time Warping

Notes

Acknowledgment

This work was supported in part by the US National Science Foundation (Grant # 1718071). Any opinions, findings and conclusions are those of the authors alone, and do not reflect the views of the funding agency.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, University of South Florida, Tampa, USA
