Finding Fraud in Health Insurance Data with Two-Layer Outlier Detection Approach

  • Rob M. Konijn
  • Wojtek Kowalczyk
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6862)

Abstract

Conventional techniques for detecting outliers address the problem of finding isolated observations that differ significantly from the other observations stored in a database. For example, in the context of health insurance, one might be interested in finding unusual claims concerning prescribed medicines. Each claim record may contain information on the prescribed drug (its code), volume (e.g., the number of pills and their weight), dosing, and price. Finding outliers in such data can be used for identifying fraud. However, when searching for fraud, it is more important to analyse data not on the level of single records, but on the level of single patients, pharmacies or GPs.

In this paper we present a novel approach for finding outliers in such hierarchical data. Our method uses standard techniques for measuring outlierness of single records and then aggregates these measurements to detect outliers in entities that are higher in the hierarchy. We applied this method to a set of about 40 million records from a health insurance company to identify suspicious pharmacies.
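The two-layer idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the field names (`drug`, `price`, `pharmacy`), the median/MAD record score, the threshold of 3.0, and the "fraction of flagged records" aggregation are all assumptions chosen for illustration, standing in for whatever record-level outlier measure and aggregation the paper actually uses.

```python
from collections import defaultdict
from statistics import median


def record_scores(claims):
    """Layer 1 (assumed): robust z-score of each claim's price within its drug code."""
    prices_by_drug = defaultdict(list)
    for claim in claims:
        prices_by_drug[claim["drug"]].append(claim["price"])

    # Median and median absolute deviation (MAD) per drug code.
    stats = {}
    for drug, prices in prices_by_drug.items():
        med = median(prices)
        mad = median(abs(p - med) for p in prices)
        stats[drug] = (med, mad)

    scores = []
    for claim in claims:
        med, mad = stats[claim["drug"]]
        # If all prices for a drug are (nearly) identical, treat the record as normal.
        scores.append(abs(claim["price"] - med) / mad if mad > 0 else 0.0)
    return scores


def entity_scores(claims, scores, key="pharmacy", threshold=3.0):
    """Layer 2 (assumed): fraction of an entity's records scoring above the threshold."""
    total = defaultdict(int)
    flagged = defaultdict(int)
    for claim, score in zip(claims, scores):
        total[claim[key]] += 1
        if score > threshold:
            flagged[claim[key]] += 1
    return {entity: flagged[entity] / total[entity] for entity in total}


# Hypothetical toy data: three pharmacies dispensing the same drug.
toy_prices = {
    "A": [10, 11, 9, 10],
    "B": [10, 12, 10, 11],
    "C": [10, 30, 35, 11],
}
claims = [
    {"pharmacy": ph, "drug": "X", "price": price}
    for ph, prices in toy_prices.items()
    for price in prices
]

ranking = entity_scores(claims, record_scores(claims))
most_suspicious = max(ranking, key=ranking.get)
```

In this toy data, pharmacy `C` ends up with the highest entity-level score because half of its claims are far from the typical price for the drug. Any record-level measure could be substituted in the first function without changing the aggregation step.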

Keywords

Outlier Detection, Single Record, Claim Amount, Local Outlier Factor, Single Outlier
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Rob M. Konijn (1)
  • Wojtek Kowalczyk (1)
  1. Department of Computer Science, VU University Amsterdam, The Netherlands
