Provider profiling and labeling of fraudulent health insurance claims using Weighted MultiTree

Settipalli, Lavanya; Gangadharan, G. R.

doi:10.1007/s12652-021-03481-6

Original Research
Published: 15 September 2021

Volume 14, pages 3487–3508, (2023)
Journal of Ambient Intelligence and Humanized Computing

Abstract

Healthcare organizations have recently become engaged in digitizing the health insurance system. Despite the undeniable benefits of digitization, the risk of providers exaggerating a claim or fabricating one entirely is increasing. Provider profiling aids in detecting outlying false claims by measuring the performance of providers and the outcomes of healthcare, and it has therefore become an active research topic in health insurance. However, most existing provider profiling approaches produce intermediate results because of class overlaps. Another problem in developing or validating an automated fraud detection model is the availability of labeled data. Manual labeling of large volumes of claims data by medical experts is not always feasible. Hence, it is essential to automate this part of the fraud detection process, which researchers developing healthcare fraud detection models have not focused on. One existing approach automates the labeling of health insurance claims by using the provider's unique identification number as the reference for one-to-one mapping with real-world fraudulent claims. However, that approach suffers from missing values in providers' identification numbers, which leads to poor performance in healthcare fraud detection models. In this study, we propose a Weighted MultiTree approach to mitigate these problems of provider profiling and labeling. MultiTree is a directed acyclic graph (DAG) construction in which each node is reachable from any other node without ambiguity. As a result, the proposed approach performs provider profiling without intermediate results and at a lower construction cost. Labeling claims using the set of unique provider details obtained from the MultiTree enhances the detection accuracy for fraudulent claims.
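
To make the "without ambiguity" property concrete, the following is a minimal sketch of the general graph-theoretic notion of a multitree: a DAG in which there is at most one directed path between any ordered pair of nodes. It is not the authors' Weighted MultiTree construction (the abstract gives neither the weighting nor the construction steps); the function name `is_multitree` and the node labels in the usage note are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def is_multitree(edges):
    """Return True if the directed graph given by (parent, child) edges is
    acyclic and has at most one directed path between any ordered pair of
    nodes (the usual definition of a multitree)."""
    children = defaultdict(set)  # a set ignores repeated copies of an edge
    nodes = set()
    for parent, child in edges:
        children[parent].add(child)
        nodes.update((parent, child))

    for source in nodes:
        seen = set()
        stack = [source]
        while stack:
            node = stack.pop()
            for nxt in children[node]:
                # Reaching the source again means a cycle; reaching any
                # other node twice means two distinct routes from source.
                if nxt == source or nxt in seen:
                    return False
                seen.add(nxt)
                stack.append(nxt)
    return True
```

For example, `is_multitree([("specialty", "provider_a"), ("region", "provider_a"), ("provider_a", "claim_1")])` holds because every reachable node has exactly one route from each ancestor, whereas a diamond such as `[("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]` fails because `d` can be reached from `a` along two routes.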

Availability of data and material

The data used for this paper are available in the public domain.

Code availability

Custom code was developed for this study.

Funding

This work is partially supported by the Scheme for Promotion of Academic and Research Collaboration (SPARC), sponsored by the Ministry of Human Resource Development, Government of India, under the project titled Digital Health Records Storage and Analysis for Healthcare Provisioning of Global Patients: An India-Australia Initiative (1406).

Author information

Authors and Affiliations

National Institute of Technology, Tiruchirappalli, India
Lavanya Settipalli & G. R. Gangadharan

Corresponding author

Correspondence to G. R. Gangadharan.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Settipalli, L., Gangadharan, G.R. Provider profiling and labeling of fraudulent health insurance claims using Weighted MultiTree. J Ambient Intell Human Comput 14, 3487–3508 (2023). https://doi.org/10.1007/s12652-021-03481-6

Received: 09 December 2020
Accepted: 02 September 2021
Published: 15 September 2021
Issue Date: April 2023
DOI: https://doi.org/10.1007/s12652-021-03481-6
