A Supervised Auto-Tuning Approach for a Banking Fraud Detection System

  • Michele Carminati
  • Luca Valentini
  • Stefano Zanero
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10332)

Abstract

In this paper, we propose an extension to Banksealer, one of the most recent and effective banking fraud detection systems. Until now, Banksealer could not exploit analyst feedback to tune itself and improve its performance. It also depended on a complex set of parameters that had to be tuned by hand before the system went into operation.

To overcome both of these limitations, we propose a supervised evolutionary wrapper approach that uses analyst feedback on fraudulent transactions to automatically tune feature weights and improve Banksealer’s detection performance. The tuning is carried out by a multi-objective genetic algorithm.
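As a rough illustration of what a wrapper-style, multi-objective search over feature weights can look like, the Python sketch below evolves weight vectors and keeps the Pareto-optimal ones. It is a minimal sketch under stated assumptions, not the method of the paper: the abstract does not specify the objectives, the genetic operators, or the detector internals, so the weighted-sum scorer, the two objectives (labelled frauds ranked in the top k, and a low mean score for legitimate transactions), the mutation-only loop, and every function name here are hypothetical.

```python
import random

def normalise(weights):
    """Scale weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def objectives(weights, rows, labels, k):
    """Return two values to maximise (both are assumptions for this sketch):
    frauds ranked in the top k, and the negated mean score of legitimate rows."""
    # Hypothetical stand-in for the detector: a weighted sum of feature scores.
    scores = [sum(w * x for w, x in zip(weights, row)) for row in rows]
    ranked = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in ranked[:k])
    legit = [s for s, y in zip(scores, labels) if y == 0]
    return (hits, -sum(legit) / len(legit))

def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates, rows, labels, k):
    """Non-dominated (weights, objectives) pairs, best objectives first."""
    scored = [(w, objectives(w, rows, labels, k)) for w in candidates]
    front = [(w, s) for w, s in scored
             if not any(dominates(t, s) for _, t in scored)]
    return sorted(front, key=lambda pair: pair[1], reverse=True)

def mutate(weights, rng, sigma=0.1):
    """Gaussian perturbation, clipped to stay positive, then renormalised."""
    return normalise([max(1e-6, w + rng.gauss(0, sigma)) for w in weights])

def tune_weights(rows, labels, k, generations=30, pop_size=20, seed=0):
    """Mutation-only evolutionary loop that keeps the Pareto front each round."""
    rng = random.Random(seed)
    n = len(rows[0])
    population = [normalise([rng.random() + 1e-6 for _ in range(n)])
                  for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        candidates = population + children
        survivors = [w for w, _ in pareto_front(candidates, rows, labels, k)]
        survivors = survivors[:pop_size]
        while len(survivors) < pop_size:  # refill with random candidates
            survivors.append(rng.choice(candidates))
        population = survivors
    return pareto_front(population, rows, labels, k)

# Toy data (invented): feature 0 separates frauds, feature 1 is mostly noise.
rows = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.1, 0.7], [0.2, 0.1]]
labels = [1, 1, 0, 0, 0, 0]  # 1 = transaction confirmed as fraud by an analyst
front = tune_weights(rows, labels, k=2)
for weights, scores in front[:3]:
    print([round(w, 3) for w in weights], scores)
```

The search treats the detector as a black box and judges each weight vector only by how the resulting ranking agrees with the analyst labels, which is what makes it a wrapper. Returning a Pareto front instead of a single winner leaves the trade-off between the objectives to the operator.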

We deployed our solution in a real-world setting at a large national banking group and conducted an in-depth experimental evaluation. We show that the proposed system was able to detect sophisticated frauds, improving Banksealer’s performance by up to 35% in some cases.

Keywords

  • Internet banking
  • Fraud detection
  • Genetic algorithm
  • Supervised learning

Notes

Acknowledgment

This work has received funding from the European Union’s Horizon 2020 Programme, under grant agreement 700326 “RAMSES”, as well as from projects co-funded by the Lombardy region and Secure Network S.r.l.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Michele Carminati (1)
  • Luca Valentini (1)
  • Stefano Zanero (1)
  1. Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
