Network analytics for insurance fraud detection: a critical case study

Deprez, Bruno; Vandervorst, Félix; Verbeke, Wouter; Verdonck, Tim; Baesens, Bart

doi:10.1007/s13385-024-00384-6

Case Study
Published: 14 May 2024

European Actuarial Journal (2024)

Abstract

There has been an increasing interest in fraud detection methods, driven by new regulations and by the financial losses linked to fraud. One of the state-of-the-art methods to fight fraud is network analytics. Network analytics leverages the interactions between different entities to detect complex patterns that are indicative of fraud. However, network analytics has only recently been applied to fraud detection in the actuarial literature. Although it shows much potential, many network methods are not yet applied. This paper extends the literature in two main ways. First, we review and apply multiple methods in the context of insurance fraud and assess their predictive power against each other. Second, we analyse the added value of network features over intrinsic features to detect fraud. We conclude that (1) complex methods do not necessarily outperform basic network features, and that (2) network analytics helps to detect different fraud patterns, compared to models trained on claim-specific features alone.
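
To make the contrast between claim-specific (intrinsic) features and basic network features concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy bipartite structure in which claims are linked to the parties involved in them; the claim identifiers, party names, labels and the `network_features` helper are all hypothetical. For each claim it derives two simple network features: the number of neighbouring claims that share at least one party, and the share of labelled neighbours that are known to be fraudulent.

```python
from collections import defaultdict

# Hypothetical toy data: each claim is linked to the parties involved in it.
claim_parties = {
    "c1": ["garage_A", "broker_X"],
    "c2": ["garage_A", "broker_Y"],
    "c3": ["garage_B", "broker_Y"],
    "c4": ["garage_A", "broker_X"],
}
# Hypothetical labels for past claims (1 = confirmed fraud, 0 = legitimate).
# Claim c4 is new and has no label yet.
known_fraud = {"c1": 1, "c2": 0, "c3": 0}


def network_features(claim_parties, known_fraud):
    """Return, per claim, the neighbour count and the fraud ratio among labelled neighbours."""
    # Invert the mapping: party -> set of claims the party is involved in.
    party_claims = defaultdict(set)
    for claim, parties in claim_parties.items():
        for party in parties:
            party_claims[party].add(claim)

    features = {}
    for claim, parties in claim_parties.items():
        # Neighbours are all other claims that share at least one party.
        neighbours = set()
        for party in parties:
            neighbours |= party_claims[party]
        neighbours.discard(claim)  # a claim's own label must not leak into its features

        labelled = [known_fraud[n] for n in neighbours if n in known_fraud]
        fraud_ratio = sum(labelled) / len(labelled) if labelled else 0.0
        features[claim] = {
            "n_neighbours": len(neighbours),
            "fraud_ratio": fraud_ratio,
        }
    return features


features = network_features(claim_parties, known_fraud)
for claim in sorted(features):
    print(claim, features[claim])
```

Features of this kind could be appended to a claim's intrinsic features before training a classifier, which is one way to compare models with and without network information.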

Data Availability

The healthcare provider data set is available on Kaggle (https://www.kaggle.com/datasets/rohitrox/healthcare-provider-fraud-detection-analysis). The motor insurance data set is proprietary.

Notes

https://www.kaggle.com/datasets/rohitrox/healthcare-provider-fraud-detection-analysis.
https://github.com/B-Deprez/NetworkFraud_BiRank_M2V_SAGE.
Among an insurer’s (independent) companies over state lines.
Among subsidiaries of an insurance company.
Among different agents involved, i.e., hospitals, patients and pharmacies.
Since claim and fraud data are highly sensitive, we only give a rough approximation of the numbers, which can either be rounded up or down. The total number is the sum of these semi-random numbers.
https://www.kaggle.com/datasets/rohitrox/healthcare-provider-fraud-detection-analysis.

Funding

This work was supported by the Research Foundation—Flanders (FWO research project 1SHEN24N).

Author information

Authors and Affiliations

Faculty of Economics and Business, KU Leuven, Naamsestraat 69, 3000, Leuven, Belgium
Bruno Deprez, Félix Vandervorst, Wouter Verbeke & Bart Baesens
Department of Mathematics, University of Antwerp, Middelheimlaan 1, 2020, Antwerp, Belgium
Bruno Deprez, Félix Vandervorst & Tim Verdonck
Department of Mathematics, KU Leuven, Celestijnenlaan 200B, 3001, Leuven, Belgium
Tim Verdonck
Data Office, Allianz Benelux, Koning Albert II Laan 32, 1000, Brussels, Belgium
Félix Vandervorst
Department of Decision Analytics and Risk, University of Southampton, University Road, Southampton, SO17 1BJ, UK
Bart Baesens

Corresponding author

Correspondence to Bruno Deprez.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the author(s).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Deprez, B., Vandervorst, F., Verbeke, W. et al. Network analytics for insurance fraud detection: a critical case study. Eur. Actuar. J. (2024). https://doi.org/10.1007/s13385-024-00384-6

Received: 18 December 2023
Revised: 22 February 2024
Accepted: 09 April 2024
Published: 14 May 2024
DOI: https://doi.org/10.1007/s13385-024-00384-6
