Predicting and Explaining Privacy Risk Exposure in Mobility Data

  • Conference paper
Discovery Science (DS 2020)

Abstract

Mobility data is a proxy for different social dynamics, and its analysis enables a wide range of user services. Unfortunately, mobility data is very sensitive, because sharing people’s whereabouts may raise serious privacy concerns. Existing frameworks for privacy risk assessment provide tools to identify and measure privacy risks, but they often (i) have high computational complexity and (ii) cannot provide users with a justification of the reported risks. In this paper, we propose expert, a new framework for the prediction and explanation of privacy risk on mobility data. We empirically evaluate privacy risk on real data by simulating a privacy attack with a state-of-the-art privacy risk assessment framework. We then extract individual mobility profiles from the data to predict their risk. We compare the performance of several machine learning algorithms to identify the best approach for our task. Finally, we show how privacy risk predictions on real data can be explained using two algorithms: Shap, a feature importance-based method, and Lore, a rule-based method. Overall, expert is able to provide a user with the privacy risk and an explanation of the risk itself. The experiments show excellent performance for the prediction task.
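
The first step of the pipeline, simulating an attack to obtain a risk value per user, can be illustrated with a minimal sketch. The abstract does not specify the attack, so the following rests on assumptions: the adversary is assumed to know any k locations visited by a user, and risk is assumed to be the reciprocal of the number of users whose data match that knowledge, taken in the worst case. The toy data and the names `TRAJECTORIES`, `privacy_risk`, and `risk_class` are hypothetical. Only the 0.5 threshold between low and high risk comes from the paper's notes.

```python
from itertools import combinations

# Hypothetical toy data: each user maps to the set of locations they visited.
TRAJECTORIES = {
    "u1": {"A", "B", "C"},
    "u2": {"A", "B", "D"},
    "u3": {"A", "E", "F"},
    "u4": {"A", "B", "C"},
}


def privacy_risk(user, trajectories, k=2):
    """Worst-case re-identification risk for `user`.

    Assumption (not from the paper's abstract): the adversary knows any
    k locations visited by the user, and the risk for one such piece of
    knowledge is 1 / (number of users whose data match it).
    """
    visited = trajectories[user]
    size = min(k, len(visited))
    worst = 0.0
    for knowledge in combinations(sorted(visited), size):
        # Count users whose visited set contains every known location.
        matches = sum(
            1 for locs in trajectories.values() if set(knowledge) <= locs
        )
        # The user always matches their own data, so matches >= 1.
        worst = max(worst, 1.0 / matches)
    return worst


def risk_class(risk, threshold=0.5):
    """Two-class discretization: low if risk <= 0.5, high otherwise."""
    return "low" if risk <= threshold else "high"


risks = {u: privacy_risk(u, TRAJECTORIES) for u in TRAJECTORIES}
labels = {u: risk_class(r) for u, r in risks.items()}
```

In a setup like this, the resulting `labels` would serve as the target that a classifier learns to predict from per-user mobility features, which avoids rerunning the combinatorial attack simulation for every new user.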

Notes

  1. EU GDPR can be found at the following link: http://bit.ly/1TlgbjI.

  2. In our experiments we discretize the risk in two main classes: low risk (privacy risk \(\le 0.5\)) and high risk (privacy risk \(> 0.5\)).

  3. Voronoi tessellation obtained by using: http://geoanalytics.net/V-Analytics/.

  4. Hyper-parameter settings: https://github.com/francescanaretto/prp.

  5. https://scikit-learn.org/stable/.

  6. https://github.com/kingfengji/gcForest.

Acknowledgments

This work has been funded by the European projects SoBigData-PlusPlus (Grant Id 871042), XAI (Grant Id 834756) and HumanE-AI-Net (Grant Id 952026).

Author information

Corresponding author

Correspondence to Roberto Pellungrini.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Naretto, F., Pellungrini, R., Monreale, A., Nardini, F.M., Musolesi, M. (2020). Predicting and Explaining Privacy Risk Exposure in Mobility Data. In: Appice, A., Tsoumakas, G., Manolopoulos, Y., Matwin, S. (eds) Discovery Science. DS 2020. Lecture Notes in Computer Science, vol 12323. Springer, Cham. https://doi.org/10.1007/978-3-030-61527-7_27

  • DOI: https://doi.org/10.1007/978-3-030-61527-7_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61526-0

  • Online ISBN: 978-3-030-61527-7

  • eBook Packages: Computer Science, Computer Science (R0)
