Unsupervised Anomaly Detection for Discrete Sequence Healthcare Data

Snorovikhina, Victoria; Zaytsev, Alexey

doi:10.1007/978-3-030-72610-2_30

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12602)

Included in the following conference series:

International Conference on Analysis of Images, Social Networks and Texts

Abstract

Fraud in healthcare is widespread, as doctors can prescribe unnecessary treatments to inflate bills. Insurance companies want to detect these anomalous fraudulent bills and reduce their losses. Traditional fraud detection methods rely on expert rules and manual data processing. More recently, machine learning techniques have automated this process, but hand-labeled data is extremely costly and usually out of date. We propose a machine learning model that automates fraud detection in an unsupervised way. We consider two deep learning approaches: an LSTM neural network that predicts the next patient visit, and a seq2seq model. To normalize the produced anomaly scores, we propose an Empirical Distribution Function (EDF) approach. As a result, the algorithm can handle problems with high class imbalance.

We validate the models on real data from Allianz consisting of sequences of patients’ visits. The models provide state-of-the-art results for unsupervised anomaly detection applied to healthcare fraud. Our EDF approach further improves the quality of the LSTM model.
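
The abstract does not spell out how the EDF normalization is applied, so the following is only a minimal sketch of one plausible reading: a raw anomaly score is replaced by the fraction of reference scores that do not exceed it, which maps any score scale to [0, 1]. The function name `edf_normalizer` and the example scores are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def edf_normalizer(reference_scores: Sequence[float]) -> Callable[[float], float]:
    """Build a function that maps a raw score to its empirical CDF value.

    Assumption: the reference scores are raw anomaly scores collected on
    (mostly normal) training data; the paper may construct them differently.
    """
    ordered = sorted(reference_scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("reference_scores must not be empty")

    def normalize(score: float) -> float:
        # Fraction of reference scores that are <= the given score.
        return bisect_right(ordered, score) / n

    return normalize


# Hypothetical raw scores, e.g. negative log-likelihoods of observed visits.
reference = [0.2, 0.4, 0.5, 0.9, 3.1]
normalize = edf_normalizer(reference)

# Normalized values lie in [0, 1]; values near 1 are the most unusual.
normalized = [normalize(s) for s in (0.1, 0.5, 10.0)]
print(normalized)
```

Because the output depends only on the rank of a score among the reference scores, scores produced by different models or for different event types become directly comparable.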

Acknowledgments

We thank Martin Spindler for providing the data and Ivan Fursov for providing code for data processing. This work was supported by the federal program “Research and development in priority areas for the development of the scientific and technological complex of Russia for 2014–2020” via grant RFMEFI60619X0008.

Author information

Authors and Affiliations

Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
Victoria Snorovikhina & Alexey Zaytsev

Editor information

Editors and Affiliations

RWTH Aachen University, Aachen, Germany
Wil M. P. van der Aalst
University of Ljubljana, Ljubljana, Slovenia
Vladimir Batagelj
National Research University Higher School of Economics, Moscow, Russia
Dmitry I. Ignatov
Krasovskii Institute of Mathematics and Mechanics, Yekaterinburg, Russia
Michael Khachay
National Research University Higher School of Economics, St. Petersburg, Russia
Olessia Koltsova
University of Oslo, Oslo, Norway
Andrey Kutuzov
National Research University Higher School of Economics, Moscow, Russia
Sergei O. Kuznetsov
National Research University Higher School of Economics, Moscow, Russia
Irina A. Lomazova
Moscow State University, Moscow, Russia
Natalia Loukachevitch
LORIA, Vandœuvre lès Nancy, France
Amedeo Napoli
Skolkovo Institute of Science and Technology, Moscow, Russia
Alexander Panchenko
University of Florida, Gainesville, FL, USA
Panos M. Pardalos
Università Ca' Foscari Venezia, Venice, Italy
Marcello Pelillo
National Research University Higher School of Economics, Nizhny Novgorod, Russia
Andrey V. Savchenko
Kazan Federal University, Kazan, Russia
Elena Tutubalina

About this paper

Cite this paper

Snorovikhina, V., Zaytsev, A. (2021). Unsupervised Anomaly Detection for Discrete Sequence Healthcare Data. In: van der Aalst, W.M.P., et al. Analysis of Images, Social Networks and Texts. AIST 2020. Lecture Notes in Computer Science, vol 12602. Springer, Cham. https://doi.org/10.1007/978-3-030-72610-2_30

DOI: https://doi.org/10.1007/978-3-030-72610-2_30
Published: 09 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72609-6
Online ISBN: 978-3-030-72610-2
eBook Packages: Computer Science, Computer Science (R0)
