Abstract
Hospitals and other health care providers tend to become involved in exaggerated and fraudulent medical claims under national insurance schemes. The present study applies data mining techniques to detect fraudulent or abusive reporting by healthcare providers, using their invoices for diabetic outpatient services. The study is set in the context of Taiwan’s National Health Insurance system. We compare the identification accuracy of three algorithms: logistic regression, neural network, and classification tree. While all three are quite accurate, the classification tree model performs best, with an overall correct identification rate of 99%. It is followed by the neural network (96%) and the logistic regression model (92%).
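As a rough illustration of how an overall correct identification rate can be used to rank competing models, the sketch below computes it as the share of providers whose predicted label matches the audited label. The labels, the predictions, and the function name `correct_identification_rate` are hypothetical and chosen only for illustration; they are not the study's data, models, or results.

```python
from typing import Sequence


def correct_identification_rate(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Share of providers whose predicted label matches the audited label."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


# Hypothetical audited labels for ten providers: 1 = fraudulent/abusive, 0 = normal.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

# Hypothetical predictions from three models (illustrative only, not the study's output).
predictions = {
    "logistic_regression": [1, 0, 1, 0, 0, 1, 0, 0, 1, 0],
    "neural_network": [1, 1, 1, 0, 0, 1, 0, 0, 1, 0],
    "classification_tree": [1, 1, 1, 0, 0, 0, 0, 0, 1, 0],
}

# Score every model with the same metric, then pick the highest-scoring one.
rates = {name: correct_identification_rate(actual, pred) for name, pred in predictions.items()}
best_model = max(rates, key=rates.get)
```

Because fraudulent providers are usually a minority, an overall rate of this kind is best read alongside the separate rates for the fraudulent and normal groups.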
Acknowledgements
The authors thank the National Science Council of the Republic of China for financially supporting this research (NSC 95–2416-H-264-010).
Cite this article
Liou, FM., Tang, YC. & Chen, JY. Detecting hospital fraud and claim abuse through diabetic outpatient services. Health Care Manage Sci 11, 353–358 (2008). https://doi.org/10.1007/s10729-008-9054-y
DOI: https://doi.org/10.1007/s10729-008-9054-y