On Anomaly Detection in Graphs as Node Classification

  • Conference paper
Big Data Management and Analysis for Cyber Physical Systems (BDET 2022)

Abstract

Graphs are appealing candidates for modeling relational datasets in domains such as cryptocurrency transaction networks, social networks, and rating platforms. Recently, powerful methods have emerged for analyzing datasets whose complex underlying connectivity can be modeled by graphs. These methods have demonstrated promising performance on common graph tasks, including node classification. In this paper, we explore the impact of graph-based techniques for detecting anomalous entities in real-world networks. We model the detection of anomalous entities in a network as a node classification task, and inspect the role of different approaches, together with the evaluation setup and metrics, to provide recommendations for practical applications. We investigate different ways of handling dataset imbalance, a common problem when dealing with datasets containing anomalies, and demonstrate how a method that is agnostic to the imbalance may show misleading performance. Through extensive experiments on six real-world datasets in balanced and unbalanced settings for a node classification task, we provide several recommendations that shed more light on the challenges of selecting methods, settings, and performance metrics that align with the intrinsic attributes of a specific dataset and task.
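
The point about misleading performance under imbalance can be illustrated with a minimal sketch. It is not taken from the paper: the label counts (1000 nodes, 50 of them anomalous), the `binary_metrics` helper, and the trivial "always predict normal" classifier are all hypothetical, chosen only to show how accuracy can look strong while anomaly-class metrics collapse.

```python
from typing import Dict, List


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute basic metrics for binary labels, where 1 marks an anomalous node."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    balanced_accuracy = (recall + specificity) / 2

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical imbalanced node labels: 50 anomalous nodes out of 1000.
y_true = [1] * 50 + [0] * 950

# An imbalance-agnostic baseline that labels every node as normal.
y_pred = [0] * 1000

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts the baseline reaches an accuracy of 0.95 while its recall and F1 on the anomalous class are both 0 and its balanced accuracy is 0.5, so metrics that account for the minority class give a very different picture from accuracy alone.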

Author information

Correspondence to Farimah Poursafaei.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Poursafaei, F., Zilic, Z., Rabbany, R. (2023). On Anomaly Detection in Graphs as Node Classification. In: Tang, L.C., Wang, H. (eds) Big Data Management and Analysis for Cyber Physical Systems. BDET 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 150. Springer, Cham. https://doi.org/10.1007/978-3-031-17548-0_2
