Abstract
Mobility data is a proxy for many social dynamics, and its analysis enables a wide range of user services. Unfortunately, mobility data is highly sensitive, because sharing people’s whereabouts may raise serious privacy concerns. Existing frameworks for privacy risk assessment provide tools to identify and measure privacy risks, but they often (i) have high computational complexity and (ii) cannot provide users with a justification of the reported risks. In this paper, we propose expert, a new framework for predicting and explaining privacy risk on mobility data. We empirically evaluate privacy risk on real data by simulating a privacy attack with a state-of-the-art privacy risk assessment framework. We then extract individual mobility profiles from the data and use them to predict each individual’s risk. We compare the performance of several machine learning algorithms to identify the best approach for our task. Finally, we show how privacy risk predictions on real data can be explained using two algorithms: Shap, a feature-importance-based method, and Lore, a rule-based method. Overall, expert provides a user with both the privacy risk and an explanation of that risk. The experiments show excellent performance for the prediction task.
Notes
1. EU GDPR can be found at the following link: http://bit.ly/1TlgbjI.
2. In our experiments we discretize the risk in two main classes: low risk (privacy risk \(\le 0.5\)) and high risk (privacy risk \(> 0.5\)).
3. Voronoi tessellation obtained by using: http://geoanalytics.net/V-Analytics/.
4. Hyper-parameter settings: https://github.com/francescanaretto/prp.
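The discretization described in note 2 can be sketched in a few lines of Python. This is only an illustration of the stated threshold, not the authors' code: the function names, the string labels "low" and "high", and the check that risk values lie in [0, 1] are assumptions introduced here.

```python
from typing import Iterable, List

# Boundary between the two classes, as stated in note 2.
RISK_THRESHOLD = 0.5


def discretize_risk(risk: float, threshold: float = RISK_THRESHOLD) -> str:
    """Map a privacy risk score to a class label.

    Assumption: risk is a score in [0, 1]; values outside that range
    are rejected. The label strings are hypothetical.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError(f"privacy risk must lie in [0, 1], got {risk}")
    # Low risk includes the boundary (risk <= 0.5); high risk is strictly above it.
    return "low" if risk <= threshold else "high"


def discretize_all(risks: Iterable[float]) -> List[str]:
    """Apply discretize_risk to every score in an iterable."""
    return [discretize_risk(r) for r in risks]


if __name__ == "__main__":
    example_scores = [0.0, 0.25, 0.5, 0.51, 1.0]  # made-up scores for illustration
    print(discretize_all(example_scores))
```

A score of exactly 0.5 falls in the low-risk class, because the note defines low risk with \(\le\) and high risk with a strict \(>\).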
Acknowledgments
This work has been funded by the European projects SoBigData-PlusPlus (Grant Id 871042), XAI (Grant Id 834756) and HumanE-AI-Net (Grant Id 952026).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Naretto, F., Pellungrini, R., Monreale, A., Nardini, F.M., Musolesi, M. (2020). Predicting and Explaining Privacy Risk Exposure in Mobility Data. In: Appice, A., Tsoumakas, G., Manolopoulos, Y., Matwin, S. (eds) Discovery Science. DS 2020. Lecture Notes in Computer Science, vol 12323. Springer, Cham. https://doi.org/10.1007/978-3-030-61527-7_27
DOI: https://doi.org/10.1007/978-3-030-61527-7_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61526-0
Online ISBN: 978-3-030-61527-7
eBook Packages: Computer Science, Computer Science (R0)