Privacy Risk for Individual Basket Patterns

  • Roberto Pellungrini
  • Anna Monreale
  • Riccardo Guidotti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11054)


Retail data are of fundamental importance for businesses and enterprises that want to understand the purchasing behaviour of their customers. Such data are also useful for developing analytical services and for marketing purposes, often based on individual purchasing patterns. However, retail data and the models extracted from them may also reveal very sensitive information to malicious third parties. Therefore, in this paper we propose a methodology for empirically assessing privacy risk in the release of individual purchasing data. Experiments on real-world retail data show that, although individual patterns describe only a summary of customer activity, they may be successfully used for customer re-identification.



Work partially supported by the EU H2020 Program under the funding scheme “INFRAIA-1-2014-2015: Research Infrastructures”, grant agreement 654024 “SoBigData”.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Roberto Pellungrini (1)
  • Anna Monreale (1)
  • Riccardo Guidotti (1, 2)

  1. University of Pisa, Pisa, Italy
  2. KDDLab, ISTI-CNR, Pisa, Italy
