
Approaches for identifying U.S. medicare fraud in provider claims data

Published in Health Care Management Science

Abstract

Quality, affordable healthcare is an important aspect of people’s lives, particularly as they age. The rising elderly population in the United States (U.S.), with an increasing number of chronic diseases, implies continuing healthcare later in life and the need for programs, such as U.S. Medicare, to help with the associated medical expenses. Unfortunately, healthcare fraud adversely affects these programs, draining resources and reducing the quality and accessibility of necessary healthcare services. Fraud detection is critical to identifying and, subsequently, stopping these perpetrators. Machine learning methods and data mining strategies can be leveraged to improve current fraud detection processes and reduce the resources needed to find and investigate possible fraudulent activities. In this paper, we employ an approach that predicts a physician’s expected specialty based on the type and number of procedures performed. From this approach, we generate a baseline model, comparing Logistic Regression and Multinomial Naive Bayes, in order to test and assess several new approaches to improve the detection of U.S. Medicare Part B provider fraud. Our results indicate that our proposed improvement strategies (specialty grouping, class removal, and class isolation), applied to different medical specialties, have mixed results over the selected Logistic Regression baseline model’s fraud detection performance. Through our work, we demonstrate that improvements to current detection methods can be effective in identifying potential fraud.


Notes

  1. https://www.medicalbillingandcoding.org/common-problems-coding/

  2. https://www.insurancefraud.org/statistics.htm#13

  3. A tabular representation comparing the predicted class membership against the actual class membership for each instance present in the dataset, denoting true positives, true negatives, false positives, and false negatives.

  4. Separate Office or Facility or Combined Office or Facility.

  5. Original four classes comes from our previous work and is based on unique procedures that have both a high number of instances and poor classification performance; Chosen classes removes the original four classes plus twelve additional specialties.

  6. A model built by adjusting the costs associated with Type I and Type II errors.

References

  1. HHS.gov. Hhs report: Average health insurance premiums doubled since 2013. [Online]. Available: https://www.hhs.gov/about/news/2017/05/23/hhs-report-average-health-insurance-premiums-doubled-2013.html

  2. Forbes. Healthcare – 5, 10, 20 years in the past and future. [Online]. Available: https://www.forbes.com/sites/singularity/2012/07/02/healthcare-5-10-20-years-in-the-past-and-future/#4d2c89b4310b

  3. U.S. Department of Health and Human Services. Health, United States, 2016. [Online]. Available: https://www.cdc.gov/nchs/data/hus/hus16.pdf

  4. Feldstein M (2006) Balancing the goals of health care provision and financing. Health Aff 25(6):1603–1611

  5. U.S. Centers for Medicare & Medicaid Services (2018) What’s Medicare. [Online]. Available: https://www.medicare.gov/sign-up-change-plans/decide-how-to-get-medicare/whats-medicare/what-is-medicare.html

  6. U.S. Centers for Medicare & Medicaid Services (2017) Other entities frequently asked questions. [Online]. Available: https://www.cms.gov/Medicare/Medicare-Fee-for-Service-Payment/sharedsavingsprogram/Downloads/other-entities-faqs.pdf

  7. U.S. Centers for Medicare & Medicaid Services. (2012) Medicare claim submission guidelines fact sheet. [Online]. Available: http://www.nacns.org/wp-content/uploads/2016/11/CMS_ReimbursementClaim.pdf

  8. Joudaki H, Rashidian A, Minaei-Bidgoli B, Mahmoodi M, Geraili B, Nasiri M, Arab M (2016) Improving fraud and abuse detection in general physician claims: a data mining study. Int J Health Policy Manag 5(3):165

  9. Pawar MP (2016) Review on data mining techniques for fraud detection in health insurance. IJETT 3:2

  10. Coalition Against Insurance Fraud. By the numbers: fraud statistics. [Online]. Available: http://www.insurancefraud.org/statistics.htm

  11. Lambert J, Dunstan R Civil recovery schemes: for or against? [Online]. Available: https://www.theguardian.com/law/2010/dec/07/civil-recovery-schemes-for-or-against

  12. Medicare Fraud Strike Force. Office of inspector general. [Online]. Available: https://www.oig.hhs.gov/fraud/strike-force/

  13. Ko JS, Chalfin H, Trock BJ, Feng Z, Humphreys E, Park S-W, Carter HB, Frick KD, Han M (2015) Variability in medicare utilization and payment among urologists. Urol 85(5):1045–1051

  14. Big Data in Bioinformatics and Health Care Informatics, in conjunction with the IEEE International Conference on Big Data, Santa Clara, Oct 6, 2013. [Online]. Available: http://www.ittc.ku.edu/jhuan/BBH/

  15. Bauder RA, Khoshgoftaar TM, Richter A, Herland M (2016) Predicting medical provider specialties to detect anomalous insurance claims. In: 2016 IEEE 28th international conference on tools with artificial intelligence (ICTAI). IEEE, pp 784–790

  16. Herland M, Bauder RA, Khoshgoftaar TM (2017) Medical provider specialty predictions for the detection of anomalous medicare insurance claims. In: 2017 IEEE 18th international conference information reuse and integration (IRI). IEEE, pp 579–588

  17. Bauder RA, Khoshgoftaar TM (2018) A survey of medicare data processing and integration for fraud detection. In: 2018 IEEE 19th international conference on information reuse and integration (IRI). IEEE, pp 9–14

  18. CMS. Research, statistics, data, and systems. [Online]. Available: https://www.cms.gov/research-statistics-data-and-systems/research-statistics-data-and-systems.html

  19. CMS. Medicare provider utilization and payment data: Physician and other supplier. [Online]. Available: https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Physician-and-Other-Supplier.html

  20. LEIE. Office of inspector general leie downloadable databases. [Online]. Available: https://oig.hhs.gov/exclusions/authorities.asp

  21. Bauder RA, Khoshgoftaar TM, Seliya N (2017) A survey on the state of healthcare upcoding fraud analysis and detection. Health Serv Outcome Res Methodol 17(1):31–55

  22. Feldman K, Chawla NV (2015) Does medical school training relate to practice? Evidence from big data. Big Data 3(2):103–113

  23. Bauder RA, Khoshgoftaar TM (2017) Multivariate outlier detection in medicare claims payments applying probabilistic programming methods. Health Serv Outcome Res Methodol 17(3–4):256–289

  24. Sadiq S, Tao Y, Yan Y, Shyu M-L (2017) Mining anomalies in medicare big data using patient rule induction method. In: 2017 IEEE third international conference on multimedia big data (BigMM). IEEE, pp 185–192

  25. Chandola V, Sukumar SR, Schryver JC (2013) Knowledge discovery from massive healthcare claims data. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1312–1320

  26. Branting LK, Reeder F, Gold J, Champney T (2016) Graph analytics for healthcare fraud risk estimation. In: 2016 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM). IEEE, pp 845–851

  27. CMS. National provider identifier standard (npi). [Online]. Available: https://www.cms.gov/Regulations-and-Guidance/Administrative-Simplification/NationalProvIdentStand/

  28. CMS. HCPCS - General Information. [Online]. Available: https://www.cms.gov/Medicare/Coding/MedHCPCSGenInfo/index.html?redirect=/medhcpcsgeninfo/

  29. OIG. Office of inspector general exclusion authorities. [Online]. Available: https://oig.hhs.gov/exclusions/index.asp

  30. Pande V, Maas W (2013) Physician medicare fraud: characteristics and consequences. Int J Pharm Healthc Mark 7(1):8–33

  31. The R Foundation. What is r? [Online]. Available: https://www.r-project.org/about.html

  32. Python Software Foundation. Python. [Online]. Available: https://www.python.org/

  33. Witten IH, Frank E, Hall M, Pal CJ (2016) Data mining: practical machine learning tools and techniques. Morgan Kaufmann, San Mateo

  34. Apache. Welcome to spark python api docs! [Online]. Available: http://spark.apache.org/docs/2.1.0/api/python/index.html

  35. Shanahan JG, Dai L (2015) Large scale distributed data science using Apache Spark. In: Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 2323–2324

  36. Khoshgoftaar TM, Seiffert C, Van Hulse J, Napolitano A, Folleco A (2007) Learning with limited minority class data. In: 2007 sixth international conference on machine learning and applications (ICMLA 2007). IEEE, pp 348–353

  37. Van Hulse J, Khoshgoftaar TM, Napolitano A (2007) Experimental perspectives on learning from imbalanced data. In: Proceedings of the 24th international conference on machine learning. ACM, pp 935–942

  38. Weka. Costsensitiveclassifier. [Online]. Available: https://weka.wikispaces.com/CostSensitiveClassifier

  39. Department of Justice U.S. Attorney’s Office (2016) Federal jury convicts tinley park physician in medicare fraud scheme. [Online]. Available: https://www.justice.gov/usao-ndil/pr/federal-jury-convicts-tinley-park-physician-medicare-fraud-scheme

  40. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The weka data mining software: an update. ACM SIGKDD Explor Newsl 11(1):10–18

  41. Le Cessie S, Van Houwelingen JC (1992) Ridge estimators in logistic regression. Appl Stat 41(1):191–201

  42. Malouf R (2002) A comparison of algorithms for maximum entropy parameter estimation. In: Proceedings of the 6th conference on natural language learning, vol 20. Association for Computational Linguistics, pp 1–7

  43. Platt J, Smola A (1998) Fast training of support vector machines using sequential minimal optimization. In: Schoelkopf B, Burges C (eds) Advances in Kernel methods - support vector learning. MIT Press. [Online]. Available: http://research.microsoft.com/~jplatt/smo.html.

  44. Breiman L (2001) Random forests. Mach Learn 45(1):5–32


Acknowledgments

The authors would like to thank the anonymous reviewers and associate editor for their insightful evaluation and constructive feedback of this paper, as well as the members of the Data Mining and Machine Learning Laboratory, Florida Atlantic University, for their assistance in the review process. We acknowledge partial support by the NSF (CNS-1427536). Opinions, findings, conclusions, or recommendations in this paper are the authors’ and do not reflect the views of the NSF.

Author information

Corresponding author

Correspondence to Richard A. Bauder.

Ethics declarations

Conflict of interest

Author 1 declares that he has no conflict of interest. Author 2 declares that he has no conflict of interest. Author 3 declares that he has no conflict of interest.

Ethical Approval:

This article does not contain any studies with human participants or animals performed by any of the authors.

Appendices

Appendix A

Multinomial Naive Bayes classifies new instances (in our case, a particular provider’s Medicare claims for a given year) by computing posterior probabilities of class membership from each feature value, learned from a set of labeled training instances. The approximation applies Bayes’ rule under the assumption of conditional independence, the idea that each feature in the dataset is independent of the others. Although this assumption rarely holds in practice, the model is very effective and is used extensively in data mining and machine learning [33]. We use Naive Bayes in both PySpark and Weka.
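As an illustrative sketch only (not the paper’s actual Weka/PySpark pipeline), a Multinomial Naive Bayes specialty classifier over per-provider procedure counts can be written from scratch; all feature values and specialty labels below are hypothetical toy data:

```python
import math
from collections import defaultdict

def train_multinomial_nb(X, y, alpha=1.0):
    """Fit class log-priors and Laplace-smoothed feature log-likelihoods."""
    n_features = len(X[0])
    counts = defaultdict(lambda: [0.0] * n_features)
    class_sizes = defaultdict(int)
    for row, label in zip(X, y):
        class_sizes[label] += 1
        for j, c in enumerate(row):
            counts[label][j] += c
    n = len(y)
    model = {}
    for label, size in class_sizes.items():
        total = sum(counts[label]) + alpha * n_features
        log_prior = math.log(size / n)
        log_like = [math.log((counts[label][j] + alpha) / total)
                    for j in range(n_features)]
        model[label] = (log_prior, log_like)
    return model

def predict_nb(model, x):
    """Pick the class maximizing log prior + count-weighted log-likelihoods."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(c * ll for c, ll in zip(x, log_like))
    return max(model, key=score)

# Hypothetical rows: counts of three procedure codes per provider-year.
X = [[30, 0, 2], [1, 25, 0], [28, 1, 3], [0, 30, 1]]
y = ["Cardiology", "Dermatology", "Cardiology", "Dermatology"]
model = train_multinomial_nb(X, y)
print(predict_nb(model, [25, 2, 1]))  # expected: Cardiology
```

In practice one would use a library implementation (e.g., Weka’s NaiveBayesMultinomial or PySpark MLlib) rather than hand-rolled code; the sketch only shows the posterior computation described above.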

Logistic Regression predicts the probability that a categorically distributed dependent variable belongs to each class, using a set of independent variables passed through a logistic function. Multinomial Logistic Regression (MLR) extends binomial Logistic Regression to dependent variables with more than two categories. Unlike Naive Bayes, there is no requirement of statistical independence between the independent variables, though they are assumed to exhibit little or no collinearity [40, 41]. We used Logistic Regression in both Weka and PySpark; our study uses the version trained with the Limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (LBFGS) to improve memory usage [42]. In this research, we use both binomial and multinomial Logistic Regression.
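The core of the model is the logistic function mapping a linear score to a class probability. A minimal binomial sketch, trained here with plain stochastic gradient descent rather than LBFGS, on hypothetical two-feature data:

```python
import math

def sigmoid(z):
    """Logistic function: maps a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Binomial logistic regression via simple gradient descent (not LBFGS)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - t  # gradient of the log-loss w.r.t. the linear score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)

# Hypothetical features: [share of office-visit codes, share of surgical codes].
X = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
y = [0, 0, 1, 1]  # 0 = expected specialty, 1 = unexpected
w, b = fit_logistic(X, y)
print(predict(w, b, [0.15, 0.85]))  # expected: 1
```

LBFGS replaces the gradient-descent loop with a quasi-Newton update that approximates the inverse Hessian from a small history of gradients, which is what makes it memory-efficient on the full feature set.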

Support Vector Machine (SVM) models map training instances to points in a space such that the categories are, as nearly as possible, linearly separable, with the goal of leaving the largest gap between them. The specific implementation of SVM used in Weka is Sequential Minimal Optimization (SMO), which uses this optimization algorithm as the training method for support vector classification [40, 43].
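To make the “largest gap” concrete: for a linear SVM with weight vector w and bias b, the decision function is w·x + b and the width of the margin SMO maximizes is 2/‖w‖. A tiny sketch with hypothetical learned values:

```python
import math

# Hypothetical weights/bias as might be produced by SMO training.
w = [2.0, -1.0]
b = -0.5

def decision(x):
    """Signed distance proxy: positive -> one class, negative -> the other."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Geometric margin width between the two supporting hyperplanes.
margin_width = 2.0 / math.sqrt(sum(wi * wi for wi in w))
print(decision([1.0, 0.5]), round(margin_width, 3))  # 1.0 0.894
```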

Random Forest (RF) is an ensemble learning method that generates a large number of decision trees; the class value appearing most often among the trees (the mode) is the class predicted by the model. As an ensemble method, RF aggregates tree predictors where each tree within the forest depends on the values of a random vector sampled independently [40, 44]. We use the Random Forest learner only in Weka.
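The aggregation step itself is just a majority vote over the trees’ outputs, as in this sketch (the per-tree votes are hypothetical):

```python
from collections import Counter

def forest_predict(tree_predictions):
    """Random Forest output: the mode of the individual trees' votes."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical votes from five trees for one provider instance.
votes = ["fraud", "non-fraud", "fraud", "fraud", "non-fraud"]
print(forest_predict(votes))  # fraud
```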

Appendix B

Type I error rate (false positive rate) is the percentage of instances that are actually non-fraud but marked as fraud, relative to the number of actual non-fraud instances; a fire alarm sounding when there is no fire is an example of this kind of error. Type II error rate (false negative rate) is the percentage of instances that are actually fraud but marked as non-fraud, relative to the number of actual fraud instances; a fire breaking out without the alarm ringing would be a false negative. Note that in binary classification, a balance between the two error rates, while minimizing the Type II error rate, is generally preferred. Recall measures the rate at which actually positive instances are marked positive; in our study, recall is the fraction of physicians labeled with their correct specialty rather than any other. Precision is the ratio of actually positive instances among all instances marked positive; here, precision is the fraction of physicians marked correctly relative to all physicians, from any specialty, marked as the class in question.
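All four quantities fall directly out of the confusion-matrix counts; a small sketch with hypothetical counts:

```python
def rates(tp, fp, tn, fn):
    """Type I/II error rates, recall, and precision from confusion-matrix counts."""
    type_i = fp / (fp + tn)    # false positive rate
    type_ii = fn / (fn + tp)   # false negative rate
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return type_i, type_ii, recall, precision

# Hypothetical: 80 frauds caught, 20 missed, 10 false alarms, 890 correct non-frauds.
t1, t2, rec, prec = rates(tp=80, fp=10, tn=890, fn=20)
print(t1, t2, rec, prec)
```

With these counts the Type II rate is 0.2 (one fraud in five is missed) even though the Type I rate is near 0.011, illustrating why the two are balanced separately.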

F-score (also known as F1-score or F-measure) is the harmonic mean of precision and recall, producing a value between 0 and 1, where values closer to 1 indicate better performance. For this study, we assume equal weighting between precision and recall, with β = 1, as seen in Eq. 1.

$$ F_{1} = (1 + \beta^{2}) \times \frac{Recall \times Precision}{(\beta^{2} \times Recall) + Precision} $$
(1)

G-measure, also known as the Fowlkes-Mallows index, is the geometric mean of precision and recall, giving the central point between the two values, as seen in Eq. 2.

$$ \text{G-measure} = \sqrt{Recall \times Precision} $$
(2)
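Equations 1 and 2 translate directly into code; the precision/recall values below are hypothetical:

```python
import math

def f_beta(precision, recall, beta=1.0):
    """Eq. 1: weighted harmonic mean of precision and recall (beta = 1 -> F1)."""
    return (1 + beta ** 2) * (recall * precision) / (beta ** 2 * recall + precision)

def g_measure(precision, recall):
    """Eq. 2: geometric mean of precision and recall."""
    return math.sqrt(recall * precision)

p, r = 0.8, 0.6  # hypothetical precision and recall
print(round(f_beta(p, r), 4), round(g_measure(p, r), 4))
```

Note the geometric mean always sits at or above the harmonic mean for the same inputs, so G-measure is slightly more forgiving of an imbalance between precision and recall than F1.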

Finally, in order to leverage the successes of our prior works, we manipulated the datasets by filtering to certain specialties, so that we could test one fraudulent specialty at a time (based on the model predicting the physician’s specialty). Since the test dataset contains only one specialty at a time, the overall accuracy is the percentage of real-world fraudulent physicians that are considered not fraudulent. To capture the percentage of instances labeled as another class, we incorporate the Inverse Overall Accuracy (IOA) performance measure, where IOA = 1 − overall accuracy, i.e., the percentage of fraudulent physicians marked as fraudulent for a given specialty. As shown in Eq. 3, to calculate the model’s overall weighted average IOA (owaIOA), we take the IOA for each specialty, that specialty’s number of fraudulent instances (n), and the total number of instances (N), and sum over all specialties with an F-score of 0.75 or above (NoS).

$$ owaIOA = \sum\limits_{i = 1}^{i=NoS} \frac{n_{i}}{N} \times IOA_{i} $$
(3)
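Equation 3 is a weighted average over the qualifying specialties; a sketch with hypothetical per-specialty counts and IOA values:

```python
def owa_ioa(specialties, N):
    """Eq. 3: sum of (n_i / N) * IOA_i over specialties with F-score >= 0.75."""
    return sum(n / N * ioa for n, ioa in specialties)

# Hypothetical (fraud instance count n_i, IOA_i) pairs for qualifying specialties.
qualifying = [(40, 0.90), (60, 0.75)]
N = 100  # total fraudulent instances across the qualifying specialties
print(owa_ioa(qualifying, N))  # 0.4*0.90 + 0.6*0.75 = 0.81
```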

Appendix C

Table 15 Fraud detection results with groups only (group test one)
Table 16 Fraud detection results with groups and non-grouped specialties (group test two)
Table 17 Class removal (original four classes) fraud detection results
Table 18 Class removal (chosen classes) fraud detection results


Cite this article

Herland, M., Bauder, R.A. & Khoshgoftaar, T.M. Approaches for identifying U.S. medicare fraud in provider claims data. Health Care Manag Sci 23, 2–19 (2020). https://doi.org/10.1007/s10729-018-9460-8
