Abstract

One method for detecting fraud is to check for suspicious changes in user behavior. This paper describes the automatic design of user profiling methods for the purpose of fraud detection, using a series of data mining techniques. Specifically, we use a rule-learning program to uncover indicators of fraudulent behavior from a large database of customer transactions. Then the indicators are used to create a set of monitors, which profile legitimate customer behavior and indicate anomalies. Finally, the outputs of the monitors are used as features in a system that learns to combine evidence to generate high-confidence alarms. The system has been applied to the problem of detecting cellular cloning fraud based on a database of call records. Experiments indicate that this automatic approach performs better than hand-crafted methods for detecting fraud. Furthermore, this approach can adapt to the changing conditions typical of fraud detection environments.
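The three stages named in the abstract (indicators, profiling monitors, evidence combination) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the call-record fields `hour` and `minutes`, the `Monitor` class, the use of a standard-deviation score, and the hand-set weights and threshold. The described system learns how to combine monitor outputs, whereas this sketch hard-codes a weighted sum.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, Dict, Sequence

# Hypothetical call record, e.g. {"hour": 23, "minutes": 4.0}.
Call = Dict[str, float]


@dataclass
class Monitor:
    """Profiles how often one fraud indicator fires per day for one customer."""
    indicator: Callable[[Call], bool]  # hypothetical rule, e.g. "night-time call"
    baseline_mean: float = 0.0
    baseline_std: float = 1.0

    def daily_count(self, day: Sequence[Call]) -> int:
        # Number of calls in one account-day that match the indicator.
        return sum(1 for call in day if self.indicator(call))

    def profile(self, legit_days: Sequence[Sequence[Call]]) -> None:
        # Learn the customer's normal level from days assumed to be fraud-free.
        counts = [self.daily_count(day) for day in legit_days]
        self.baseline_mean = mean(counts)
        # Floor the spread so a very regular customer does not yield huge scores.
        self.baseline_std = max(pstdev(counts), 1.0)

    def score(self, day: Sequence[Call]) -> float:
        # Deviation above the profiled level; usage below normal scores zero.
        excess = self.daily_count(day) - self.baseline_mean
        return max(0.0, excess / self.baseline_std)


def alarm(monitors: Sequence[Monitor], weights: Sequence[float],
          threshold: float, day: Sequence[Call]) -> bool:
    """Combine monitor outputs as a weighted sum (weights are hand-set here)."""
    evidence = sum(w * m.score(day) for m, w in zip(monitors, weights))
    return evidence >= threshold
```

In use, each monitor would first be profiled on a customer's known-legitimate days, after which `alarm` is called once per new account-day. A learned combiner would replace the fixed `weights` and `threshold` arguments.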


About this article

Cite this article

Fawcett, T., Provost, F. Adaptive Fraud Detection. Data Mining and Knowledge Discovery 1, 291–316 (1997). https://doi.org/10.1023/A:1009700419189

  • DOI: https://doi.org/10.1023/A:1009700419189
