Improving fraud detection via imbalanced graph structure learning

Published in Machine Learning

Abstract

Graph-based fraud detection methods have recently attracted much attention because graph-structured data carries rich relational information that can help identify fraudsters. However, GNN-based algorithms may perform poorly under graph heterophily, since fraudsters often disguise themselves by deliberately making extensive connections to normal users. In addition, class imbalance causes GNNs to overfit to normal users and perform poorly on fraudsters. To address these problems, we propose an Imbalanced Graph Structure Learning framework for fraud detection (IGSL for short). Specifically, a multi-relational class-balanced sampler picks nodes for mini-batch training. An iterative graph structure learning module then iteratively constructs a global homophilic adjacency matrix in the embedding domain. Further, an anchor node message passing mechanism reduces the computational complexity of constructing the homophilic adjacency matrix. Extensive experiments on benchmark datasets show that IGSL achieves significantly better performance even when the graph is heavily heterophilic and imbalanced.
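
The abstract names a class-balanced sampler and an anchor node message passing step without giving their formulas. The NumPy sketch below illustrates the general ideas only and is not the authors' implementation. It assumes a single relation, inverse-class-frequency sampling weights, randomly chosen anchors, and a clipped cosine similarity; the function names `class_balanced_sample`, `anchor_adjacency`, and `anchor_message_passing` are hypothetical.

```python
import numpy as np


def class_balanced_sample(labels, batch_size, rng):
    """Pick node indices with probability inversely proportional to class size.

    Assumption: single relation, inverse-frequency weights (not from the paper).
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    freq = dict(zip(classes.tolist(), counts.tolist()))
    weights = np.array([1.0 / freq[c] for c in labels.tolist()])
    probs = weights / weights.sum()
    return rng.choice(len(labels), size=batch_size, replace=False, p=probs)


def anchor_adjacency(embeddings, num_anchors, rng):
    """Build an n x s node-to-anchor affinity matrix from embeddings.

    Assumption: anchors are sampled uniformly and affinities are
    non-negative cosine similarities.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    anchor_idx = rng.choice(len(z), size=num_anchors, replace=False)
    r = np.maximum(z @ z[anchor_idx].T, 0.0)
    return r, anchor_idx


def anchor_message_passing(r, features, eps=1e-12):
    """Propagate features nodes -> anchors -> nodes.

    This applies the implicit n x n adjacency D_node^-1 R D_anchor^-1 R^T
    without ever forming it, so the cost is O(n * s * d) rather than
    O(n^2 * d) for n nodes, s anchors, and d feature dimensions.
    """
    anchor_deg = r.sum(axis=0) + eps           # shape (s,)
    node_deg = r.sum(axis=1) + eps             # shape (n,)
    anchor_msg = (r / anchor_deg).T @ features  # shape (s, d)
    return (r / node_deg[:, None]) @ anchor_msg  # shape (n, d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.array([0] * 95 + [1] * 5)      # heavily imbalanced toy labels
    batch = class_balanced_sample(labels, 10, rng)
    print("fraud nodes in batch:", int(labels[batch].sum()))

    emb = rng.normal(size=(100, 8))
    r, _ = anchor_adjacency(emb, num_anchors=10, rng=rng)
    print("propagated shape:", anchor_message_passing(r, emb).shape)
```

With s much smaller than n, the two-step propagation avoids storing or multiplying a dense n x n matrix, which is the kind of saving the anchor mechanism is meant to provide.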

Data availability statement

The datasets are available from the corresponding author on reasonable request.

Funding

This work is supported by the National Natural Science Foundation of China (No. U22A2035, No. U1803262, No. U1736206), the National Social Science Foundation of China (No. 19ZDA113), and the Application Foundation Frontier Project of Wuhan Science and Technology Bureau (No. 2020010601012288).

Author information

Contributions

Authors’ contributions follow the authors’ order convention.

Corresponding author

Correspondence to Ruimin Hu.

Ethics declarations

Conflicts of interest

The authors of this work declare no conflict of interest.

Ethics approval

Not applicable.

Consent to participate

Yes.

Consent for publication

Yes.

Code availability

The source code of the current work is available from the corresponding author on reasonable request.

Additional information

Editors: Dino Ienco, Roberto Interdonato, Pascal Poncelet.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ren, L., Hu, R., Liu, Y. et al. Improving fraud detection via imbalanced graph structure learning. Mach Learn 113, 1069–1090 (2024). https://doi.org/10.1007/s10994-023-06464-0

  • DOI: https://doi.org/10.1007/s10994-023-06464-0
