Abstract
Fraud in healthcare is widespread: for example, doctors can prescribe unnecessary treatments to inflate bills. Insurance companies want to detect these anomalous, fraudulent bills and reduce their losses. Traditional fraud detection methods rely on expert rules and manual data processing. More recently, machine learning techniques have automated this process, but hand-labeled data is extremely costly and quickly becomes outdated. We propose a machine learning model that automates fraud detection in an unsupervised way. We consider two deep learning approaches: an LSTM neural network that predicts the next patient visit, and a seq2seq model. To normalize the resulting anomaly scores, we propose an Empirical Distribution Function (EDF) approach. As a result, the algorithm can handle problems with high class imbalance.
We validate the models on real data from Allianz consisting of sequences of patient visits. The models provide state-of-the-art results for unsupervised anomaly detection in healthcare fraud. Our EDF approach further improves the quality of the LSTM model.
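The abstract does not spell out how the EDF normalization is computed, so the following is only a minimal sketch of the general idea, assuming the standard empirical distribution function: a raw anomaly score is mapped to the fraction of reference scores that are less than or equal to it. The function name `fit_edf` and the example scores are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def fit_edf(reference_scores):
    """Build an empirical distribution function from reference scores.

    Hypothetical helper: returns a function that maps a raw anomaly
    score to a value in [0, 1].
    """
    ordered = sorted(reference_scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one reference score")

    def edf(score):
        # Fraction of reference scores that are <= the given score.
        return bisect_right(ordered, score) / n

    return edf


# Made-up raw anomaly scores, for example prediction errors of a
# next-visit model; these numbers are illustrative only.
raw_scores = [0.2, 0.5, 0.1, 3.0, 0.4]

edf = fit_edf(raw_scores)
normalized = [edf(s) for s in raw_scores]
print(normalized)
```

After this mapping, scores that came from different scales become comparable, because each value is simply the score's rank fraction among the reference scores.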
Acknowledgments
We thank Martin Spindler for providing the data and Ivan Fursov for providing code for data processing. This work was supported by the federal program “Research and development in priority areas for the development of the scientific and technological complex of Russia for 2014–2020” via grant RFMEFI60619X0008.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Snorovikhina, V., Zaytsev, A. (2021). Unsupervised Anomaly Detection for Discrete Sequence Healthcare Data. In: van der Aalst, W.M.P., et al. Analysis of Images, Social Networks and Texts. AIST 2020. Lecture Notes in Computer Science, vol 12602. Springer, Cham. https://doi.org/10.1007/978-3-030-72610-2_30
DOI: https://doi.org/10.1007/978-3-030-72610-2_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72609-6
Online ISBN: 978-3-030-72610-2
eBook Packages: Computer Science (R0)