Web Scams Detection System

  • Conference paper
Foundations and Practice of Security (FPS 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14551)

Abstract

Web-based scams rely on scam websites that offer fraudulent businesses or fake services to steal money and sensitive information from unsuspecting victims. Many researchers have worked on anti-scam techniques, focusing mainly on understanding, detecting, and analyzing scam sites. Even so, state-of-the-art anti-scam research still faces several challenges, such as acquiring a properly labeled scam dataset, especially when no blacklist, central repository, or previous large-scale analysis exists. Researchers have created labeled datasets in different ways, such as collecting and labeling the data manually or using a semi-automatic crawler followed by manual inspection. However, this process requires prior knowledge and understanding of the scam, as well as considerable manual work.

In this paper, we propose a data-driven model for creating a labeled training dataset for web-based scams. Given a small scam sample, our model formulates scam-related search queries and submits them to multiple search engines to find and collect potential scam pages. Once it has collected a sufficiently large corpus of web pages, the model semi-automatically clusters the search results and creates a labeled training dataset with minimal human interaction. We validated our model on two scam types that we studied in previous work. We tested our classifiers against the databases of web pages we collected during our previous analysis of those scams and detected more than 87% of the scam pages while keeping the false positive rate as low as 0.23%.
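
The abstract does not say how queries are formulated or how the collected pages are clustered. As a rough illustration only, the sketch below assumes that queries are the word pairs shared by the most seed pages and that pages are grouped by Jaccard similarity of their word sets. The function names, the stopword list, and the 0.3 threshold are hypothetical choices, not the authors' method.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list used only for this sketch.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "for", "is", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def formulate_queries(seed_pages, top_k=3):
    """Return the top_k bigrams ranked by how many seed pages contain them."""
    counts = Counter()
    for page in seed_pages:
        tokens = tokenize(page)
        # A set ensures each bigram is counted at most once per page.
        counts.update(set(zip(tokens, tokens[1:])))
    return [" ".join(bigram) for bigram, _ in counts.most_common(top_k)]


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pages(pages, threshold=0.3):
    """Greedy one-pass clustering over page indices.

    A page joins the first cluster whose representative (its first member)
    is at least `threshold` similar; otherwise it starts a new cluster.
    """
    token_sets = [set(tokenize(p)) for p in pages]
    clusters = []
    for i, tokens in enumerate(token_sets):
        for cluster in clusters:
            if jaccard(tokens, token_sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

In a workflow like the one the abstract describes, a human would then inspect one representative per cluster and assign a label to the whole cluster, which is where the reduction in manual effort would come from under these assumptions.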

Notes

  1. https://www.scamwatch.gov.au/scam-statistics.

  2. A separate report hosted with our dataset describes how we applied the model to create and validate the training datasets for identifying the Bitcoin Generator Scam (BGS) and the Game Hack Scam (GHS): https://bit.ly/DatasetPaper.

  3. https://trends.google.com/trends/?geo=US.

  4. https://urlscan.io/.

  5. https://website.informer.com/.

  6. https://www.cutestat.com/.

  7. https://web.archive.org/.

  8. https://www.alexa.com/.

  9. http://chromedriver.chromium.org/.

  10. https://selenium-python.readthedocs.io.

  11. https://pypi.org/project/beautifulsoup4/.

  12. We describe the dataset creation and validation process in a separate report hosted with our dataset: https://bit.ly/DatasetPaperReport.

  13. We collected the domains by crawling the cutestat.com search engine and a blacklist maintained by Bitcoin.fr.

  14. In our analysis of both the manual and automated approaches, we excluded the time taken by automated processes, such as crawling. We counted only the time we spent manually searching, inspecting, and labeling the pages.

  15. https://websitesetup.org/news/internet-facts-stats/, accessed in 2022.

Author information

Corresponding author

Correspondence to Emad Badawi.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Badawi, E., Jourdan, GV., Onut, IV. (2024). Web Scams Detection System. In: Mosbah, M., Sèdes, F., Tawbi, N., Ahmed, T., Boulahia-Cuppens, N., Garcia-Alfaro, J. (eds) Foundations and Practice of Security. FPS 2023. Lecture Notes in Computer Science, vol 14551. Springer, Cham. https://doi.org/10.1007/978-3-031-57537-2_11

  • DOI: https://doi.org/10.1007/978-3-031-57537-2_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-57536-5

  • Online ISBN: 978-3-031-57537-2

  • eBook Packages: Computer Science, Computer Science (R0)
