Abstract
Interest in fraud detection methods has been increasing, driven by new regulations and by the financial losses linked to fraud. One of the state-of-the-art approaches to fighting fraud is network analytics, which leverages the interactions between different entities to detect complex patterns that are indicative of fraud. However, network analytics has only recently been applied to fraud detection in the actuarial literature. Although it shows much potential, many network methods have not yet been applied in this setting. This paper extends the literature in two main ways. First, we review and apply multiple network methods in the context of insurance fraud and compare their predictive power. Second, we analyse the added value of network features over intrinsic features for detecting fraud. We conclude that (1) complex methods do not necessarily outperform basic network features, and that (2) network analytics helps to detect fraud patterns different from those found by models trained on claim-specific features alone.
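As a rough illustration of what "basic network features" can mean in this setting, the sketch below computes a claim's degree and the share of known-fraudulent claims among the claims it is linked to through a shared party. It is not the paper's implementation: the toy edge list, the `known_fraud` labels, and the function names `build_adjacency` and `basic_network_features` are all hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical toy data: (claim_id, party_id) edges of a bipartite claim-party network.
edges = [
    ("c1", "p1"), ("c1", "p2"),
    ("c2", "p2"), ("c2", "p3"),
    ("c3", "p3"),
    ("c4", "p4"),
]
# Hypothetical labels: claims already known to be fraudulent.
known_fraud = {"c1"}


def build_adjacency(edges):
    """Return an undirected adjacency map for the bipartite network."""
    adj = defaultdict(set)
    for claim, party in edges:
        adj[claim].add(party)
        adj[party].add(claim)
    return adj


def basic_network_features(claim, adj, known_fraud):
    """Compute simple neighbourhood features for a single claim."""
    parties = adj[claim]
    # Claims that share at least one party with this claim, excluding the claim itself.
    linked = {c for p in parties for c in adj[p]} - {claim}
    n_fraud = len(linked & known_fraud)
    return {
        "degree": len(parties),
        "n_linked_claims": len(linked),
        "fraud_share": n_fraud / len(linked) if linked else 0.0,
    }


adj = build_adjacency(edges)
features = {
    c: basic_network_features(c, adj, known_fraud)
    for c in ["c1", "c2", "c3", "c4"]
}
```

The claim itself is excluded from its own neighbourhood so that its label does not leak into its features. In a supervised setup, features like these would typically be added alongside the claim-specific (intrinsic) features before training a classifier.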
Data Availability
The healthcare provider data set is available on Kaggle (https://www.kaggle.com/datasets/rohitrox/healthcare-provider-fraud-detection-analysis). The motor insurance data set is proprietary.
Notes
Among an insurer’s (independent) companies over state lines.
Among subsidiaries of an insurance company.
Among different agents involved, i.e., hospitals, patients and pharmacies.
Since claim and fraud data are highly sensitive, we only give a rough approximation of the numbers, which can either be rounded up or down. The total number is the sum of these semi-random numbers.
Funding
This work was supported by the Research Foundation—Flanders (FWO research project 1SHEN24N).
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the author(s).
About this article
Cite this article
Deprez, B., Vandervorst, F., Verbeke, W. et al. Network analytics for insurance fraud detection: a critical case study. Eur. Actuar. J. (2024). https://doi.org/10.1007/s13385-024-00384-6
DOI: https://doi.org/10.1007/s13385-024-00384-6