Data Mining and Knowledge Discovery, Volume 1, Issue 3, pp 291–316

Adaptive Fraud Detection

  • Tom Fawcett
  • Foster Provost

Abstract

One method for detecting fraud is to check for suspicious changes in user behavior. This paper describes the automatic design of user profiling methods for the purpose of fraud detection, using a series of data mining techniques. Specifically, we use a rule-learning program to uncover indicators of fraudulent behavior from a large database of customer transactions. Then the indicators are used to create a set of monitors, which profile legitimate customer behavior and indicate anomalies. Finally, the outputs of the monitors are used as features in a system that learns to combine evidence to generate high-confidence alarms. The system has been applied to the problem of detecting cellular cloning fraud based on a database of call records. Experiments indicate that this automatic approach performs better than hand-crafted methods for detecting fraud. Furthermore, this approach can adapt to the changing conditions typical of fraud detection environments.

Keywords: fraud detection, rule learning, profiling, constructive induction, intrusion detection, applications
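The three stages named in the abstract (indicators, profiling monitors, and a learned combination of monitor outputs) can be illustrated with a minimal sketch. This is not the authors' implementation. The indicators are hand-written predicates standing in for learned rules, a simple perceptron stands in for the evidence-combining learner, and every name, field, and value below (`Monitor`, `train_perceptron`, `alarm`, the `hour` and `city` fields) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Call = Dict[str, object]            # hypothetical call record, e.g. {"hour": 23, "city": "home"}
Indicator = Callable[[Call], bool]  # stand-in for a fraud rule found by rule learning


@dataclass
class Monitor:
    """Profiles how often one indicator fires for an account on legitimate days."""
    indicator: Indicator
    baseline: float = 0.0

    def _count(self, day: Sequence[Call]) -> int:
        return sum(1 for call in day if self.indicator(call))

    def profile(self, legit_days: Sequence[Sequence[Call]]) -> None:
        # Assumed profiling rule: baseline is the largest daily count seen in legitimate use.
        self.baseline = float(max((self._count(d) for d in legit_days), default=0))

    def score(self, day: Sequence[Call]) -> float:
        # Anomaly output: excess over the account's own baseline, never negative.
        return max(0.0, float(self._count(day)) - self.baseline)


def train_perceptron(rows: List[List[float]], labels: List[int],
                     epochs: int = 20, lr: float = 0.1) -> Tuple[List[float], float]:
    """Learn weights and a bias for a linear threshold unit over monitor outputs."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            pred = 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0
            err = y - pred
            weights = [w + lr * err * v for w, v in zip(weights, x)]
            bias += lr * err
    return weights, bias


def alarm(weights: List[float], bias: float, x: List[float]) -> bool:
    """Raise an alarm when the weighted evidence exceeds the learned threshold."""
    return sum(w * v for w, v in zip(weights, x)) + bias > 0


# Toy data for a single account (entirely made up).
monitors = [
    Monitor(lambda c: c["hour"] >= 22),      # late-night calls
    Monitor(lambda c: c["city"] != "home"),  # calls placed away from the home city
]
legit_days = [
    [{"hour": 10, "city": "home"}, {"hour": 23, "city": "home"}],
    [{"hour": 9, "city": "home"}],
]
for m in monitors:
    m.profile(legit_days)

normal_day = [{"hour": 11, "city": "home"}, {"hour": 22, "city": "home"}]
fraud_day = [{"hour": 23, "city": "away"} for _ in range(4)]

normal_x = [m.score(normal_day) for m in monitors]
fraud_x = [m.score(fraud_day) for m in monitors]
weights, bias = train_perceptron([normal_x, fraud_x], [0, 1])
```

Because each monitor is profiled per account, the same late-night call that is routine for one customer can count as evidence of fraud for another, which is the point of profiling rather than applying fixed thresholds to all accounts.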

Copyright information

© Kluwer Academic Publishers 1997

Authors and Affiliations

  • Tom Fawcett (1)
  • Foster Provost (1)

  1. Nynex Science and Technology, White Plains
