Privacy-Enhanced Fraud Detection with Bloom Filters

  • Daniel Arp
  • Erwin Quiring
  • Tammo Krueger
  • Stanimir Dragiev
  • Konrad Rieck
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 254)


The online shopping sector is growing continuously, generating a turnover of billions of dollars each year. Unfortunately, this growth in popularity is not limited to regular customers: organized crime targeting online shops has evolved considerably in recent years, causing significant financial losses to merchants. As criminals often use similar strategies against different merchants, sharing information about fraud patterns could help reduce the success of these malicious activities. In practice, however, sharing data is difficult, since shops are often competitors or have to follow strict privacy laws. In this paper, we propose a novel method for fraud detection that allows merchants to exchange information on recent fraud incidents without exposing customer data. To this end, our method pseudonymizes orders on the client side before sending them to a central service for analysis. Although the service cannot access individual features of these orders, it is able to infer fraudulent patterns using machine learning techniques. We examine the capabilities of this approach and measure its impact on the overall detection performance on a dataset of more than 1.5 million orders from a large European online fashion retailer.
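The abstract does not spell out how orders are encoded, so the following is only a rough sketch of how Bloom-filter pseudonymization can work in general, not the authors' implementation. It assumes that a text field of an order is split into character bigrams and that each bigram is mapped to a few bit positions with a keyed hash (HMAC-SHA256). The function names, the shared key, the filter size of 64 bits, and the three hash functions are all illustrative assumptions.

```python
import hashlib
import hmac


def ngrams(text, n=2):
    """Return the set of character n-grams of a string (empty if too short)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def bloom_encode(text, key, num_bits=64, num_hashes=3, n=2):
    """Encode the n-grams of `text` into a Bloom filter, given as a list of 0/1 bits.

    All parameters are illustrative assumptions, not values from the paper.
    """
    bits = [0] * num_bits
    for gram in ngrams(text, n):
        for seed in range(num_hashes):
            # Derive one hash function per seed by prefixing the seed byte.
            message = bytes([seed]) + gram.encode("utf-8")
            digest = hmac.new(key, message, hashlib.sha256).digest()
            index = int.from_bytes(digest[:8], "big") % num_bits
            bits[index] = 1
    return bits


def jaccard(a, b):
    """Jaccard similarity of two bit lists of equal length."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 1.0


# Hypothetical usage: a merchant encodes a field locally and shares only the bits.
key = b"hypothetical-shared-secret"
filter_a = bloom_encode("Main Street 12", key)
filter_b = bloom_encode("Main Street 21", key)
similarity = jaccard(filter_a, filter_b)
```

In this sketch, a central service receiving only `filter_a` and `filter_b` could still compare them or feed the bit vectors to a classifier, without seeing the underlying strings. Bloom-filter encodings of this kind are not cryptographically strong on their own, and attacks on them have been studied, so the parameters would need careful selection in any real deployment.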



The authors would like to thank Alwin Maier and Paul Schmidt for their assistance during the research project. Moreover, the authors gratefully acknowledge funding from the German Federal Ministry of Education and Research (BMBF) under the project ABBO (FKZ: 13N13634).



Copyright information

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2018

Authors and Affiliations

  • Daniel Arp (1)
  • Erwin Quiring (1)
  • Tammo Krueger (2)
  • Stanimir Dragiev (2)
  • Konrad Rieck (1)

  1. Technische Universität Braunschweig, Braunschweig, Germany
  2. Zalando Payments GmbH, Berlin, Germany
