Boosting Naive Bayes for Claim Fraud Diagnosis

Part of the Lecture Notes in Computer Science book series (LNCS, volume 2454)

Abstract

In this paper we apply the weight of evidence reformulation of AdaBoosted naive Bayes scoring due to Ridgeway et al. (1998) to the diagnosis of insurance claim fraud. The method effectively combines the advantages of boosting with the modelling power and representational attractiveness of the probabilistic weight of evidence scoring framework. We present the results of an experimental comparison with an emphasis on both discriminatory power and calibration of probability estimates. The data on which we evaluate the method consists of a representative set of closed personal injury protection automobile insurance claims from accidents that occurred in Massachusetts during 1993. The findings of the study reveal the method to be a valuable contribution to the design of effective, intelligible, accountable and efficient fraud detection support.
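The scoring framework named in the abstract decomposes the posterior log odds of fraud into a prior term plus additive per-attribute "weights of evidence" (log likelihood ratios). The sketch below illustrates that decomposition for a plain, unboosted naive Bayes model on invented toy data; the feature names and counts are purely illustrative, not drawn from the paper's Massachusetts data, and the AdaBoost reweighting that Ridgeway et al. (1998) layer on top is not shown.

```python
from math import log

# Toy claim data: each row is (binary features, fraud label). The two
# features stand in for claim indicators (e.g. "soft-tissue injury",
# "no police report"); all values here are invented for illustration.
claims = [
    ((1, 1), 1), ((1, 0), 1), ((1, 1), 1),
    ((0, 0), 0), ((0, 1), 0), ((0, 0), 0), ((1, 0), 0),
]

def train_naive_bayes(data, alpha=1.0):
    """Estimate P(x_j = 1 | class) with Laplace smoothing."""
    counts = {0: [0, 0], 1: [0, 0]}  # class -> counts of x_j = 1
    totals = {0: 0, 1: 0}            # class -> number of claims
    for x, y in data:
        totals[y] += 1
        for j, v in enumerate(x):
            counts[y][j] += v
    probs = {c: [(counts[c][j] + alpha) / (totals[c] + 2 * alpha)
                 for j in range(2)] for c in (0, 1)}
    prior_logodds = log(totals[1] / totals[0])
    return probs, prior_logodds

def weights_of_evidence(x, probs):
    """Per-attribute log likelihood ratios: the 'weights of evidence'."""
    woe = []
    for j, v in enumerate(x):
        p1 = probs[1][j] if v else 1 - probs[1][j]
        p0 = probs[0][j] if v else 1 - probs[0][j]
        woe.append(log(p1 / p0))
    return woe

probs, prior = train_naive_bayes(claims)
x = (1, 1)                  # a claim exhibiting both indicators
woe = weights_of_evidence(x, probs)
score = prior + sum(woe)    # posterior log odds of fraud
```

Because the score is a sum of per-attribute terms, an investigator can read off which indicators pushed a claim toward or away from the fraud label, which is the "representational attractiveness" the abstract refers to.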

Keywords

  • Receiver Operating Characteristic Curve
  • Probability Estimate
  • White Collar Crime
  • Fraud Detection
  • Insurance Fraud

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


References

  1. Viaene, S., Derrig, R., Baesens, B., Dedene, G.: A comparison of state-of-the-art classification techniques for expert automobile insurance fraud detection. Journal of Risk and Insurance (2002) to appear

  2. Ridgeway, G., Madigan, D., Richardson, T., O'Kane, J.: Interpretable boosted naive Bayes classification. In: Fourth International Conference on Knowledge Discovery and Data Mining, New York City (1998)

  3. Weisberg, H., Derrig, R.: Identification and investigation of suspicious claims. AIB Cost Containment/Fraud Filing DOI Docket R95-12, AIB Massachusetts (1995) http://www.ifb.org/ifrr/ifrr170.pdf

  4. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Machine Learning 29 (1997) 131–163

  5. Kohavi, R., Becker, B., Sommerfield, D.: Improving simple Bayes. In: Ninth European Conference on Machine Learning, Prague (1997)

  6. Domingos, P., Pazzani, M.: On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning 29 (1997) 103–130

  7. Freund, Y., Schapire, R.: A decision-theoretic generalization of on-line learning and an application to boosting. In: Second European Conference on Computational Learning Theory, Barcelona (1995)

  8. Bauer, E., Kohavi, R.: An empirical comparison of voting classification algorithms: Bagging, boosting and variants. Machine Learning 36 (1999) 105–139

  9. Schapire, R., Freund, Y., Bartlett, P., Lee, W.: Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics 26 (1998) 1651–1686

  10. Elkan, C.: Boosting and naive Bayesian learning. Technical Report CS97-557, Department of Computer Science and Engineering, University of California, San Diego (1997)

  11. O'Kane, J., Ridgeway, G., Madigan, D.: Statistical analysis of clinical variables to predict the outcome of surgical intervention in patients with knee complaints. Statistics in Medicine (1998) submitted

  12. Good, I.: The estimation of probabilities: An essay on modern Bayesian methods. MIT Press, Cambridge (1965)

  13. Spiegelhalter, D., Knill-Jones, R.: Statistical and knowledge-based approaches to clinical decision-support systems, with an application in gastroenterology. Journal of the Royal Statistical Society. Series A (Statistics in Society) 147 (1984) 35–77

  14. Provost, F., Fawcett, T., Kohavi, R.: The case against accuracy estimation for comparing classifiers. In: Fifteenth International Conference on Machine Learning, Madison (1998)

  15. Hand, D.: Construction and assessment of classification rules. John Wiley & Sons (1997)

  16. Hanley, J., McNeil, B.: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143 (1982) 29–36

  17. Provost, F., Fawcett, T.: Robust classification for imprecise environments. Machine Learning 42 (2001) 203–231

  18. Bishop, C.: Neural networks for pattern recognition. Oxford University Press (1995)

  19. Titterington, D., Murray, G., Murray, L., Spiegelhalter, D., Skene, A., Habbema, J., Gelpke, G.: Comparison of discrimination techniques applied to a complex data set of head injured patients. Journal of the Royal Statistical Society. Series A (Statistics in Society) 144 (1981) 145–175

  20. Spiegelhalter, D.: Probabilistic prediction in patient management and clinical trials. Statistics in Medicine 5 (1986) 421–433

  21. Copas, J.: Plotting p against x. Journal of the Royal Statistical Society. Series C (Applied Statistics) 32 (1983) 25–31

  22. Bennett, P.: Assessing the calibration of naive Bayes’ posterior estimates. Technical Report CMU-CS-00-155, Computer Science Department, School of Computer Science, Carnegie Mellon University (2000)

  23. Zadrozny, B., Elkan, C.: Learning and making decisions when costs and probabilities are both unknown. In: Seventh ACM SIGKDD Conference on Knowledge Discovery and Data Mining, San Francisco (2001)

  24. Ridgeway, G., Madigan, D., Richardson, T.: Boosting methodology for regression problems. In: Seventh International Workshop on Artificial Intelligence and Statistics, Fort Lauderdale (1999)


Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Viaene, S., Derrig, R., Dedene, G. (2002). Boosting Naive Bayes for Claim Fraud Diagnosis. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2002. Lecture Notes in Computer Science, vol 2454. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46145-0_20

  • DOI: https://doi.org/10.1007/3-540-46145-0_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44123-6

  • Online ISBN: 978-3-540-46145-6

  • eBook Packages: Springer Book Archive
