Design and development of big data-based model for detecting fraud in healthcare insurance industry

Mary, A. Jenita; Claret, S. P. Angelin

doi:10.1007/s00500-023-08296-5

Application of soft computing
Published: 08 May 2023

Soft Computing, volume 27, pages 8357–8369 (2023)

A. Jenita Mary¹ &
S. P. Angelin Claret¹

Abstract

Healthcare services continue to expand in both scope and quality. One of the important issues affecting the effective use of public funds is the detection of health insurance fraud. Previous fraud detection techniques pay close attention to the characteristics of a single visit rather than to patterns across many patient visits. Because they rely on such common traits, they suffer from a higher false positive rate and poor profile construction, which reduces detection performance. This paper introduces a novel and intelligent Provider Fraud_Anomaly Detection System (PF_ADS) that combines big data and deep learning approaches for the healthcare insurance industry. The proposed framework improves the preprocessing and classification phases to detect provider fraud at an early stage. Initially, the collected datasets are preprocessed using a Relative Risk-based MapReduce framework that builds an organized set of relationships between diseases, patients, and claim variables. The classification phase is improved using a proposed Recurrent Neural Network (RNN), which includes steps to select the significant attributes using hyperparameter optimization. A key strength of RNNs is their ability to recall earlier inputs, so that the network state reflects both past and present information. Therefore, the network's state predictions and the tuning of its parameters are studied using an improved Decisional Score-based Bayesian Optimization (DS_BO). Finally, the best attributes, together with the selected hyperparameters, are fed into the input layer of the RNN to classify anomalies on the provider's side. The proposed PF_ADS framework is tested and validated on public repositories. The experimental results indicate that the proposed framework outperforms the other methods in terms of accuracy (88.09%), precision (14.15%), recall (32.80%), and computational time (92.30 s).
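
To make the relative risk-based preprocessing mentioned in the abstract more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give their formulation, so the sketch assumes the standard epidemiological relative risk, RR = [a/(a+b)] / [c/(c+d)], computed from a 2x2 table of exposure (a claim carries a given diagnosis code) against outcome (the claim was flagged). The record layout, the sample data, and the names `map_claim`, `reduce_counts`, and `relative_risk` are all hypothetical, and the map and reduce steps run in a single process rather than on a real MapReduce cluster.

```python
from collections import Counter
from functools import reduce

# Hypothetical claim records: (provider_id, diagnosis_code, flagged).
# These values are illustrative only and do not come from the paper.
claims = [
    ("P1", "D10", True),
    ("P1", "D10", False),
    ("P2", "D10", True),
    ("P2", "D20", False),
    ("P3", "D20", False),
    ("P3", "D20", True),
    ("P4", "D10", True),
    ("P4", "D20", False),
]


def map_claim(claim, diagnosis):
    """Map step: emit a single count keyed by (exposed, flagged)."""
    _, code, flagged = claim
    return Counter({(code == diagnosis, flagged): 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial 2x2 contingency tables."""
    return left + right


def relative_risk(records, diagnosis):
    """Relative risk of a flagged claim given the diagnosis code.

    Returns None when the ratio is undefined (an empty group or a
    zero risk in the unexposed group).
    """
    mapped = (map_claim(record, diagnosis) for record in records)
    table = reduce(reduce_counts, mapped, Counter())
    a = table[(True, True)]    # exposed and flagged
    b = table[(True, False)]   # exposed, not flagged
    c = table[(False, True)]   # unexposed and flagged
    d = table[(False, False)]  # unexposed, not flagged
    if a + b == 0 or c + d == 0 or c == 0:
        return None
    return (a / (a + b)) / (c / (c + d))


if __name__ == "__main__":
    for code in ("D10", "D20"):
        print(code, relative_risk(claims, code))
```

A value above 1 suggests that claims with the given diagnosis code are flagged more often than other claims, which is the kind of relationship between diseases and claim variables that a preprocessing stage could pass on to a classifier.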

Data availability

Enquiries about data availability should be directed to the authors.

Acknowledgements

The authors are grateful to all who supported us in producing this article and to those who contributed to this study.

Funding

The authors received no specific funding for this study.

Author information

Authors and Affiliations

Department of Computer Science, College of Science and Humanities, SRM Institute of Science and Technology, Kattankulathur, Chennai, Tamilnadu, 603203, India
A. Jenita Mary & S. P. Angelin Claret

Corresponding author

Correspondence to S. P. Angelin Claret.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest to report regarding the present study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mary, A.J., Claret, S.P.A. Design and development of big data-based model for detecting fraud in healthcare insurance industry. Soft Comput 27, 8357–8369 (2023). https://doi.org/10.1007/s00500-023-08296-5

Accepted: 09 March 2023
Published: 08 May 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s00500-023-08296-5
