An Efficient Method for Detecting Fraudulent Transactions Using Classification Algorithms on an Anonymized Credit Card Data Set

  • Sylvester Manlangit
  • Sami Azam
  • Bharanidharan Shanmugam
  • Krishnan Kannoorpatti
  • Mirjam Jonkman
  • Arasu Balasubramaniam
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 736)


Fraudulent credit card transactions cost businesses and banks time and money. Detecting fraud before a transaction is finalized would help them save resources. This research compares the fraud detection accuracy of different sampling techniques and classification algorithms, and proposes an efficient machine learning method for detecting fraud. An anonymized data set from Kaggle was used, in which each transaction is labeled as either fraudulent or non-fraudulent. The severe imbalance between fraudulent and non-fraudulent transactions caused the algorithms to under-perform; this was addressed by applying sampling techniques. The combination of undersampling and SMOTE raised the recall of the classification algorithms. The k-NN algorithm showed the highest recall of the algorithms compared.
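To make the sampling idea concrete, the following is a minimal sketch of how random undersampling of the majority class can be combined with SMOTE-style interpolation of the minority class before a k-NN classifier is evaluated on recall. It is not the authors' implementation: the toy two-dimensional data, the class sizes, the values of k, and all helper names (undersample, smote, knn_predict, recall) are assumptions chosen for illustration, and it uses only the Python standard library.

```python
import math
import random


def undersample(majority, n_keep, rng):
    """Randomly keep n_keep majority-class rows (without replacement)."""
    return rng.sample(majority, n_keep)


def smote(minority, n_new, k, rng):
    """Create n_new synthetic minority rows by interpolating between a
    minority row and one of its k nearest minority neighbours.
    Assumes the minority class has at least two rows."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training rows closest to query."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0


def recall(y_true, y_pred):
    """Fraction of actual positives (label 1) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0


# Toy, hypothetical data: 200 "legitimate" rows and 10 "fraud" rows.
rng = random.Random(0)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
minority = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(10)]

# Resample the training data only: shrink the majority, grow the minority.
kept = undersample(majority, 40, rng)
synthetic = smote(minority, 30, k=5, rng=rng)

balanced_x = kept + minority + synthetic
balanced_y = [0] * len(kept) + [1] * (len(minority) + len(synthetic))

# The held-out rows stay imbalanced, as real transactions would be.
test_x = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
test_x += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(5)]
test_y = [0] * 50 + [1] * 5

pred = [knn_predict(balanced_x, balanced_y, q, k=5) for q in test_x]
rec = recall(test_y, pred)
print(f"recall on toy test set: {rec:.2f}")
```

In this sketch only the training rows are resampled, so the reported recall reflects performance on data that keeps its original imbalance. The resampling ratios and neighbour counts are placeholders; the abstract does not state the values used in the study.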


Credit card, Anonymized data, Fraud detection



We would like to thank the School of Engineering and IT, Charles Darwin University, for providing funding and assistance for this research.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Sylvester Manlangit (1)
  • Sami Azam (1)
  • Bharanidharan Shanmugam (1)
  • Krishnan Kannoorpatti (1)
  • Mirjam Jonkman (1)
  • Arasu Balasubramaniam (2)
  1. School of Engineering and IT, Charles Darwin University, Darwin, Australia
  2. Cookie Analytix Pvt. Ltd., Bengaluru, India
