Solving the False Positives Problem in Fraud Prediction Using Automated Feature Engineering

  • Roy Wedge
  • James Max Kanter (email author)
  • Kalyan Veeramachaneni
  • Santiago Moral Rubio
  • Sergio Iglesias Perez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11053)


In this paper, we present an approach based on automated feature engineering that dramatically reduces false positives in fraud prediction. False positives plague the fraud prediction industry: it is estimated that only 1 in 5 transactions declared as fraud is actually fraudulent, and that roughly 1 in every 6 customers has had a valid transaction declined in the past year. To address this problem, we use the Deep Feature Synthesis algorithm to automatically derive behavioral features from the historical data of the card associated with a transaction. We generate 237 features (>100 behavioral patterns) for each transaction and use a random forest to learn a classifier. We tested our machine learning model on data from a large multinational bank and compared it to the bank's existing solution. On unseen data comprising 1.852 million transactions, we reduced false positives by 54% and provided savings of 190K euros. We also assess how to deploy this solution and whether it necessitates streaming computation for real-time scoring. We found that our solution maintains similar benefits even when historical features are computed only once every 7 days.
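The idea of deriving behavioral features from a card's history, and of refreshing that history only periodically instead of streaming it, can be illustrated with a minimal sketch. This is not the Deep Feature Synthesis algorithm or the authors' pipeline: the `Transaction` type, the function names, the `refresh_days` parameter, the handful of hand-written aggregates, and the toy data are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Transaction:
    card_id: str
    day: float     # transaction time, in days since an arbitrary origin
    amount: float


def history_cutoff(day: float, refresh_days: float) -> float:
    """Latest time whose history is visible when scoring a transaction at `day`.

    refresh_days <= 0 models streaming computation (all earlier history is
    visible); otherwise history is frozen at the most recent refresh boundary.
    """
    if refresh_days <= 0:
        return day
    return (day // refresh_days) * refresh_days


def behavioral_features(txn, history, refresh_days=0.0):
    """Hand-written aggregates over the same card's earlier transactions."""
    cutoff = history_cutoff(txn.day, refresh_days)
    past = [h.amount for h in history
            if h.card_id == txn.card_id and h.day < cutoff]
    past_mean = mean(past) if past else 0.0
    return {
        "amount": txn.amount,
        "past_count": len(past),
        "past_mean_amount": past_mean,
        "past_max_amount": max(past, default=0.0),
        "amount_minus_past_mean": txn.amount - past_mean,
    }


# Toy data, invented for this sketch (not taken from the paper).
history = [
    Transaction("card_a", 1.0, 20.0),
    Transaction("card_a", 5.0, 40.0),
    Transaction("card_a", 9.0, 60.0),
    Transaction("card_b", 2.0, 500.0),
]
current = Transaction("card_a", 10.0, 300.0)

streaming = behavioral_features(current, history)                   # full history
weekly = behavioral_features(current, history, refresh_days=7.0)    # 7-day refresh
```

With a 7-day refresh, the transaction on day 9 is not yet reflected in the features used to score the transaction on day 10, so the weekly variant sees a slightly staler picture of the card. Feature dictionaries like these would then be fed to a classifier such as a random forest.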



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Roy Wedge (1)
  • James Max Kanter (1), email author
  • Kalyan Veeramachaneni (1)
  • Santiago Moral Rubio (2)
  • Sergio Iglesias Perez (2)

  1. Data to AI Lab, LIDS, MIT, Cambridge, USA
  2. Banco Bilbao Vizcaya Argentaria (BBVA), Madrid, Spain
