Abstract
As cryptocurrency becomes widely accepted and used, the illegal activities that accompany it have attracted extensive attention, especially phishing scams, which cause great losses to both customers and countries. From the perspective of crime prevention, early warning of such illegal behavior is of great significance. However, most existing studies focus on detecting phishing scams that have already occurred and been reported. In addition, previous studies ignore the temporal order in which users appear and thus cannot accurately extract features reflecting users' transaction patterns. In this paper, we propose a framework called early-stage phishing detection to address the problem of early phishing detection. According to the phishing amount, we first divide the process of a phishing scam into three stages: early, middle, and late. Then, we develop a feature extraction method to capture features from both the local network structure and the time series of transactions. In the experiments, the dataset is strictly partitioned in chronological order, and the results show that our proposed method outperforms existing graph embedding methods on a real-world Ethereum transaction dataset. Finally, we select the ten most important features and analyze the differences between phishing users and normal users on these features, which provides useful insights for regulators and platforms to detect phishing scams in advance.
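The abstract names two procedural steps without giving their details: dividing a scam into stages by phishing amount, and partitioning the dataset by time. The following is a minimal Python sketch of how such steps could look, not the authors' implementation. The `Tx` record, the function names, the cut points of one third and two thirds of the total received amount, and the default training fraction are all hypothetical choices made for illustration; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """Hypothetical transaction record (field names are assumptions)."""
    timestamp: int   # Unix time of the transaction
    sender: str
    receiver: str
    amount: float    # value transferred, e.g. in Ether


def label_stages(incoming, early_cut=1 / 3, late_cut=2 / 3):
    """Label each incoming transaction of one phishing account by stage.

    The stage depends on the share of the account's total received amount
    accumulated up to and including that transaction. The cut points are
    illustrative assumptions, not values taken from the paper.
    """
    ordered = sorted(incoming, key=lambda tx: tx.timestamp)
    total = sum(tx.amount for tx in ordered)
    if total <= 0:
        return []
    labeled, running = [], 0.0
    for tx in ordered:
        running += tx.amount
        share = running / total
        if share <= early_cut:
            stage = "early"
        elif share <= late_cut:
            stage = "middle"
        else:
            stage = "late"
        labeled.append((tx, stage))
    return labeled


def chronological_split(txs, train_fraction=0.8):
    """Split so that every training record precedes every test record in time."""
    ordered = sorted(txs, key=lambda tx: tx.timestamp)
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]
```

Splitting on sorted timestamps rather than at random keeps later transactions out of the training set, which is the property a time-ordered evaluation depends on.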
Data availability
The dataset analyzed in the current study is available from XBlock, a blockchain data platform for the academic community: https://xblock.pro/#/.
Funding
The work described in this paper was supported by the National Natural Science Foundation of China (72025104), the Fundamental Research Funds for the Central Universities (JBK2103009), the fund of the Financial Innovation Center, SWUFE, and the achievements transformation projects of the SWUFE Jiaozi Institute of Fintech Innovation.
Author information
Contributions
The initial idea and theoretical framework were first proposed by Yun Wan. Material preparation, data collection, analysis and experiment design were performed by Yun Wan, Dapeng Zhang and Feng Xiao. The first draft of the manuscript was written by Yun Wan and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Wan, Y., Xiao, F. & Zhang, D. Early-stage phishing detection on the Ethereum transaction network. Soft Comput 27, 3707–3719 (2023). https://doi.org/10.1007/s00500-022-07661-0
DOI: https://doi.org/10.1007/s00500-022-07661-0
Keywords
- Early-stage
- Phishing detection
- Cryptocurrency
- Feature extraction