Incremental learning strategies for credit cards fraud detection

Lebichot, B.; Paldino, G. M.; Siblini, W.; He-Guelton, L.; Oblé, F.; Bontempi, G.

doi:10.1007/s41060-021-00258-0

Regular Paper
Published: 09 June 2021

International Journal of Data Science and Analytics
Volume 12, pages 165–174 (2021)

B. Lebichot¹ (ORCID: orcid.org/0000-0003-2188-0118), G. M. Paldino¹, W. Siblini², L. He-Guelton¹, F. Oblé² & G. Bontempi¹

Abstract

Every second, thousands of credit or debit card transactions are processed in financial institutions. This extensive amount of data and its sequential nature make the problem of fraud detection particularly challenging. Most analytical strategies used in production are still based on batch learning, which is inadequate for two reasons: models quickly become outdated, and training requires storing sensitive data. The evolving nature of bank fraud underscores the importance of having up-to-date models, and sensitive data retention makes companies vulnerable to infringements of the European General Data Protection Regulation. For these reasons, evaluating incremental learning strategies is recommended. This paper designs and evaluates incremental learning solutions for real-world fraud detection systems. The aim is to demonstrate the competitiveness of incremental learning over conventional batch approaches and then to improve its accuracy by employing ensemble learning, diversity and transfer learning. An experimental analysis is conducted on a full-scale case study comprising five months of e-commerce transactions made available by our industry partner, Worldline.

Notes

Although this applies only to continuous features, discrete features may be re-encoded as continuous ones; see [39] for details.
Acronyms in Table 3 follow this convention: “E” stands for ensemble of NNs, “D” for diversity criterion and “T” indicates we used transfer learning.

References

[1] HSN Consultants, Inc.: The Nilson Report 2019 (consulted on 2020-03-17). https://nilsonreport.com
[2] Abdallah, A., Maarof, M.A., Zainal, A.: Fraud detection system: A survey. J. Netw. Comput. Appl. 68, 90–113 (2016)
[3] Bhattacharyya, S., Jha, S., Tharakunnel, K., Westland, J.C.: Data mining for credit card fraud: A comparative study. Decis. Support Syst. 50(3), 602–613 (2011)
[4] Widmer, G., Kubat, M.: Learning in the presence of concept drift and hidden contexts. Mach. Learn. 23(1), 69–101 (1996)
[5] Zinkevich, M.: Online convex programming and generalized infinitesimal gradient ascent. In: Proceedings of the 20th International Conference on Machine Learning (ICML-03), pp. 928–936 (2003)
[6] Hazan, E., Agarwal, A., Kale, S.: Logarithmic regret algorithms for online convex optimization. Mach. Learn. 69(2–3), 169–192 (2007)
[7] Dal Pozzolo, A., Caelen, O., Le Borgne, Y.-A., Waterschoot, S., Bontempi, G.: Learned lessons in credit card fraud detection from a practitioner perspective. Expert Syst. Appl. 41(10), 4915–4928 (2014)
[8] Brown, G., Wyatt, J., Harris, R., Yao, X.: Diversity creation methods: a survey and categorisation. Inf. Fusion 6(1), 5–20 (2005)
[9] Sun, Y., Tang, K., Zhu, Z., Yao, X.: Concept drift adaptation by exploiting historical knowledge. IEEE Trans. Neural Netw. Learn. Syst. 29(10), 4822–4832 (2018)
[10] Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010)
[11] Street, W.N., Kim, Y.: A streaming ensemble algorithm (SEA) for large-scale classification. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 377–382 (2001)
[12] Elwell, R., Polikar, R.: Incremental learning of concept drift in nonstationary environments. IEEE Trans. Neural Netw. 22(10), 1517–1531 (2011)
[13] Ghosh, S., Reilly, D.L.: Credit card fraud detection with a neural-network. In: Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences, vol. 3, pp. 621–630. IEEE (1994)
[14] Dorronsoro, J.R., Ginel, F., Sánchez, C., Cruz, C.S.: Neural fraud detection in credit card operations. IEEE Trans. Neural Netw. 8(4), 827–834 (1997)
[15] Parisi, G.I., Kemker, R., Part, J.L., Kanan, C., Wermter, S.: Continual lifelong learning with neural networks: A review. Neural Netw. (2019)
[16] Fu, K., Cheng, D., Tu, Y., Zhang, L.: Credit card fraud detection using convolutional neural networks. In: International Conference on Neural Information Processing, pp. 483–490. Springer (2016)
[17] Pumsirirat, A., Yan, L.: Credit card fraud detection using deep learning based on auto-encoder and restricted Boltzmann machine. Int. J. Adv. Comput. Sci. Appl. 9(1), 18–25 (2018)
[18] Abakarim, Y., Lahby, M., Attioui, A.: An efficient real time model for credit card fraud detection based on deep learning. In: Proceedings of the 12th International Conference on Intelligent Systems: Theories and Applications, pp. 1–7 (2018)
[19] Nguyen, T.T., Tahir, H., Abdelrazek, M., Babar, A.: Deep learning methods for credit card fraud detection. arXiv preprint arXiv:2012.03754 (2020)
[20] Najadat, H., Altiti, O., Aqouleh, A.A., Younes, M.: Credit card fraud detection based on machine and deep learning. In: 2020 11th International Conference on Information and Communication Systems (ICICS), pp. 204–208. IEEE (2020)
[21] Forough, J., Momtazi, S.: Ensemble of deep sequential models for credit card fraud detection. Appl. Soft Comput. 99 (2021)
[22] Alippi, C., Boracchi, G., Roveri, M.: Just-in-time classifiers for recurrent concepts. IEEE Trans. Neural Netw. Learn. Syst. 24(4), 620–634 (2013)
[23] Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. (CSUR) 46(4), 1–37 (2014)
[24] Carcillo, F., Dal Pozzolo, A., Le Borgne, Y.-A., Caelen, O., Mazzer, Y., Bontempi, G.: Scarff: a scalable framework for streaming credit card fraud detection with Spark. Inf. Fusion 41, 182–194 (2018)
[25] Saito, T., Rehmsmeier, M.: The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE 10(3) (2015)
[26] Davis, J., Goadrich, M.: The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 233–240 (2006)
[27] Lebichot, B., Braun, F., Caelen, O., Saerens, M.: A graph-based, semi-supervised, credit card fraud detection system, pp. 721–733. Springer, Cham (2017)
[28] Dal Pozzolo, A.: Adaptive machine learning for credit card fraud detection. Ph.D. thesis, Université Libre de Bruxelles (2015)
[29] Machine Learning Group - ULB: Credit card fraud detection (consulted on 2020-06-28). https://www.kaggle.com/mlg-ulb/creditcardfraud
[30] Jurgovsky, J., Granitzer, M., Ziegler, K., Calabretto, S., Portier, P.-E., He, L., Caelen, O.: Sequence classification for credit-card fraud detection. Expert Syst. Appl. 100, 234–245 (2018)
[31] Chollet, F., et al.: Keras. https://keras.io (2015)
[32] Chalapathy, R., Chawla, S.: Deep learning for anomaly detection: A survey. arXiv preprint arXiv:1901.03407 (2019)
[33] Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)
[34] Cui, Y., Jia, M., Lin, T.-Y., Song, Y., Belongie, S.: Class-balanced loss based on effective number of samples. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9268–9277 (2019)
[35] Japkowicz, N.: Learning from imbalanced data sets: a comparison of various strategies. In: AAAI Workshop on Learning from Imbalanced Data Sets (2000)
[36] Carcillo, F., Le Borgne, Y.-A., Caelen, O., Kessaci, Y., Oblé, F., Bontempi, G.: Combining unsupervised and supervised learning in credit card fraud detection. Inf. Sci. 557, 317–331 (2019)
[37] Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
[38] Lebichot, B., Verhelst, T., Le Borgne, Y.-A., He-Guelton, L., Oblé, F., Bontempi, G.: Transfer learning strategies for credit card fraud detection (submitted for publication)
[39] Lebichot, B., Le Borgne, Y.-A., He-Guelton, L., Oblé, F., Bontempi, G.: Deep-learning domain adaptation techniques for credit cards fraud detection. In: Oneto, L., Navarin, N., Sperduti, A., Anguita, D. (eds.) Recent Advances in Big Data and Deep Learning, pp. 78–88. Springer International Publishing, Cham (2020)
[40] Huang, J., Smola, A.J., Gretton, A., Borgwardt, K.M., Schölkopf, B.: Correcting sample selection bias by unlabeled data. In: Proceedings of the 19th International Conference on Neural Information Processing Systems, NIPS’06, pp. 601–608. MIT Press (2006)
[41] Liu, Y., Yao, X.: Ensemble learning via negative correlation. Neural Netw. 12(10), 1399–1404 (1999)
[42] Siblini, W., Fréry, J., He-Guelton, L., Oblé, F., Wang, Y.-Q.: Master your metrics with calibration. In: International Symposium on Intelligent Data Analysis, pp. 457–469. Springer (2020)

Funding

This work was supported by the TeamUp DefeatFraud project funded by Innoviris (2017-R-49a), Brussels, Belgium. We thank this agency for allowing us to conduct both fundamental and applied research. B. Lebichot also thanks LouRIM, Université catholique de Louvain, Belgium, for their support.

Author information

Authors and Affiliations

Machine Learning Group, Computer Science Department, Faculty of Sciences, Université Libre de Bruxelles, Brussels, Belgium
B. Lebichot, G. M. Paldino, L. He-Guelton & G. Bontempi
Development and Innovation, Worldline, Lyon, France
W. Siblini & F. Oblé

Corresponding author

Correspondence to B. Lebichot.

Ethics declarations

Conflicts of interest

The authors declare they have no conflicts of interest or competing interests.

Availability of data and material

The main dataset cannot be made available for confidentiality reasons. A good proxy is the public Kaggle dataset [29], a two-day-long, anonymized extract from the same process. Experimental results with the public dataset are reported in Sect. 5.

Code availability

Code cannot be made available for confidentiality reasons.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A Appendix: Additional metrics

As explained in Sect. 2, Pr@100 can be a more relevant metric because the number of investigators, and hence of investigations per day, is limited. Figures 6 and 7 summarize the results in the form of F/N tests [37].
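As an illustration only, here is a minimal sketch of how a metric such as Pr@100 could be computed, assuming it denotes the precision among the 100 transactions with the highest fraud scores. The function name `precision_at_k` and its arguments are hypothetical and are not taken from the paper.

```python
import numpy as np

def precision_at_k(y_true, scores, k=100):
    """Fraction of true frauds among the k highest-scored transactions.

    y_true: 0/1 labels (1 = fraud); scores: predicted fraud scores.
    Both names are illustrative assumptions, not the authors' code.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    k = min(k, len(scores))
    if k == 0:
        raise ValueError("at least one transaction is required")
    # Indices of the k largest scores; a stable sort keeps ties in input order.
    top_k = np.argsort(-scores, kind="stable")[:k]
    return float(y_true[top_k].mean())

labels = [1, 0, 1, 0, 0]
fraud_scores = [0.9, 0.8, 0.7, 0.1, 0.2]
print(precision_at_k(labels, fraud_scores, k=2))  # one fraud among the top 2
```

In a daily evaluation setting, such a function would presumably be called once per day on that day's transactions, with `k` set to the number of alerts investigators can review.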

Another interesting indicator can be obtained by calibrating the AUPRC. Indeed, the proportion of fraud to detect per day ranges from 0.095% to 0.335% (with a coefficient of variation of 21%). The AUPRC is calibrated so that it is invariant to the fraud prior (see [42] for details). Figures 8 and 9 summarize the results in the form of F/N tests [37].
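To give an intuition of what invariance to the fraud prior means, the sketch below re-expresses precision in terms of the true-positive rate, the false-positive rate and a fixed reference prior. This relies only on the standard identity precision = TPR·π / (TPR·π + FPR·(1 − π)); the exact calibration used in the paper is the one defined in [42], and the names `precision_from_rates`, `calibrated_precision` and `reference_prior` are assumptions made for this example.

```python
def precision_from_rates(tpr, fpr, prior):
    """Precision implied by a TPR, an FPR and a positive-class prior."""
    denom = tpr * prior + fpr * (1.0 - prior)
    return tpr * prior / denom if denom > 0 else 0.0

def calibrated_precision(tp, fp, n_pos, n_neg, reference_prior):
    """Precision the same classifier would reach if the fraud prior were
    `reference_prior` instead of the observed n_pos / (n_pos + n_neg)."""
    tpr = tp / n_pos   # fraction of frauds that are caught
    fpr = fp / n_neg   # fraction of genuine transactions flagged
    return precision_from_rates(tpr, fpr, reference_prior)

# Two hypothetical days with the same TPR (0.5) and FPR (0.005)
# but different fraud proportions.
day_a = dict(tp=5, fp=50, n_pos=10, n_neg=10_000)
day_b = dict(tp=15, fp=100, n_pos=30, n_neg=20_000)

for day in (day_a, day_b):
    raw = day["tp"] / (day["tp"] + day["fp"])
    cal = calibrated_precision(**day, reference_prior=0.002)
    print(f"raw precision={raw:.4f}  calibrated precision={cal:.4f}")
```

The raw precisions of the two days differ although the classifier behaves identically on both, whereas the calibrated values coincide. A calibrated AUPRC would then be obtained by computing the area under the curve of calibrated precision against recall across thresholds.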

The purpose of these two additional metrics is to report the results more comprehensively and to provide different points of view for the evaluation. For the sake of conciseness, we will not comment on all the results again. This appendix shows that the conclusions are the same, in terms of statistical tests, regardless of the metric used: Pr@100, uncalibrated AUPRC or calibrated AUPRC. However, since these three metrics do not measure exactly the same quantities, there are small variations in the actual ordering of the methods in terms of average results.

B Appendix: Transfer learning in detail

This description is based on [38] and is here to make the paper self-contained. Algorithm 4 details how transfer learning is embedded into the continuous approach. In our case, the target domain is the data used to initialize the incremental NN model and the source domain is the new batch of data. We consider here only the univariate case where each feature is transferred independently of the others.

The transfer process is a nonlinear monotonic transformation of the values of a continuous random variable X (the source data), such that the cumulative distribution function (CDF) of X after transformation matches a given CDF F (the target data).

First, we compute the value of the empirical CDF of X (denoted $\hat{F}$) at each observed value $x_i, i=1,\dots , n$. The transferred value $x_i^\prime $ is then chosen such that $F(x_i^\prime ) = \hat{F}(x_i)$. We denote source examples by $x_i^{(s)}, i = 1,\dots ,n^{(s)}$ and target examples by $x_j^{(t)}, j = 1,\dots ,n^{(t)}$, where $n^{(s)}$ and $n^{(t)}$ are the numbers of source and target examples, respectively. We also denote the values of the empirical CDFs by $p_i^{(s)}=\hat{F}^{(s)}(x_i^{(s)})$ and $p_j^{(t)}=\hat{F}^{(t)}(x_j^{(t)})$.

The source examples $x_i^{(s)}$ are transformed to a CDF that matches the empirical CDF $\hat{F}^{(t)}$ of the target examples. The target examples are left unmodified. For each source example $x_i^{(s)}$ and the corresponding empirical CDF value $p_i^{(s)}$, we find the two consecutive empirical CDF values $p_{j_1}^{(t)}$ and $p_{j_2}^{(t)}$ framing $p_i^{(s)}$ in the target domain:

$$\begin{aligned} p_{j_1}^{(t)} \le p_i^{(s)} < p_{j_2}^{(t)} \end{aligned}$$

with $j_1 + 1 = j_2$. $x_i^{(s)\prime }$ is then computed as the linear interpolation between the values $x_{j_1}^{(t)}$ and $x_{j_2}^{(t)}$.
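The univariate procedure described above can be sketched in a few lines of NumPy. This is an illustrative reading of the text, not the authors' implementation (their code is not available): the function name `transfer_feature` is hypothetical, the empirical CDF is taken as the fraction of examples less than or equal to a value, and source values whose CDF value falls below the smallest target CDF value are clamped to the smallest target value, which is an assumption the appendix does not specify.

```python
import numpy as np

def transfer_feature(x_source, x_target):
    """Map one source feature so that its empirical CDF matches the target's.

    Hypothetical sketch of the univariate transfer: target examples are
    left unmodified, source examples are replaced by interpolated values.
    """
    x_source = np.asarray(x_source, dtype=float)
    x_target_sorted = np.sort(np.asarray(x_target, dtype=float))
    n_s, n_t = len(x_source), len(x_target_sorted)

    # p_i^(s): empirical CDF of the source evaluated at each source example.
    source_sorted = np.sort(x_source)
    p_source = np.searchsorted(source_sorted, x_source, side="right") / n_s

    # p_j^(t): empirical CDF of the target at its sorted examples.
    p_target = np.arange(1, n_t + 1) / n_t

    # Linear interpolation between the two consecutive target values whose
    # CDF values frame p_i^(s); np.interp clamps outside [p_1^(t), p_n^(t)].
    return np.interp(p_source, p_target, x_target_sorted)

source = [1.0, 2.0, 3.0]   # new batch (source domain)
target = [0.0, 10.0]       # initialization data (target domain)
print(transfer_feature(source, target))
```

Since each feature is transferred independently, a full batch would be processed by applying such a function column by column. The mapping is monotonic, so the ordering of the source examples along each feature is preserved.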

About this article

Cite this article

Lebichot, B., Paldino, G.M., Siblini, W. et al. Incremental learning strategies for credit cards fraud detection. Int J Data Sci Anal 12, 165–174 (2021). https://doi.org/10.1007/s41060-021-00258-0

Received: 04 July 2020
Accepted: 26 April 2021
Published: 09 June 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s41060-021-00258-0
