Predicting phishing websites based on self-structuring neural network

  • Original Article
  • Published in Neural Computing and Applications

Abstract

The Internet has become an essential component of our everyday social and financial activities. Nevertheless, Internet users may be vulnerable to different types of web threats, which may cause financial damage, identity theft, loss of private information, damage to brand reputation and loss of customers' confidence in e-commerce and online banking. Phishing is a form of web threat defined as the art of impersonating the website of an honest enterprise in order to obtain confidential information such as usernames, passwords and social security numbers. So far, no single solution can capture every phishing attack. In this article, we propose an intelligent model for predicting phishing attacks based on artificial neural networks, particularly self-structuring neural networks. Phishing is a continuous problem in which the features significant in determining the type of a web page are constantly changing. Thus, the network structure must be constantly improved in order to cope with these changes. Our model solves this problem by automating the process of structuring the network, and it shows high tolerance of noisy data, fault tolerance and high prediction accuracy. Several experiments were conducted in our research, with a different number of epochs in each experiment. The results show that all of the produced structures have high generalization ability.
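The idea of automating the network structure can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It is a generic constructive search that retrains a one-hidden-layer network with one more hidden neuron each round until the validation error reaches a target. The function names, the synthetic data with features in {-1, 0, 1}, and all parameter values (`max_hidden`, `epochs`, `target`, `lr`) are assumptions made for illustration only.

```python
import math
import random


def init_net(n_in, n_hidden, rng):
    """Small random weights; the last weight in each row is the bias."""
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w1, w2


def forward(net, x):
    """Return hidden activations and the tanh output in (-1, 1)."""
    w1, w2 = net
    xb = x + [1.0]
    hidden = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in w1]
    out = math.tanh(sum(w * h for w, h in zip(w2, hidden + [1.0])))
    return hidden, out


def train_epoch(net, data, lr):
    """One pass of stochastic gradient descent on squared error."""
    w1, w2 = net
    for x, y in data:
        hidden, out = forward(net, x)
        d_out = (out - y) * (1.0 - out * out)
        xb, hb = x + [1.0], hidden + [1.0]
        # Hidden-layer update uses w2 before w2 itself is changed.
        for j, row in enumerate(w1):
            d_h = d_out * w2[j] * (1.0 - hidden[j] ** 2)
            for i in range(len(row)):
                row[i] -= lr * d_h * xb[i]
        for j in range(len(w2)):
            w2[j] -= lr * d_out * hb[j]


def error_rate(net, data):
    """Fraction of examples whose predicted sign disagrees with the label."""
    wrong = sum(1 for x, y in data if (forward(net, x)[1] >= 0) != (y > 0))
    return wrong / len(data)


def self_structure(train, valid, max_hidden=5, epochs=50, target=0.1, lr=0.05, seed=0):
    """Grow the hidden layer until validation error <= target (hypothetical rule)."""
    rng = random.Random(seed)
    best = None
    for n_hidden in range(1, max_hidden + 1):
        net = init_net(len(train[0][0]), n_hidden, rng)
        for _ in range(epochs):
            train_epoch(net, train, lr)
        err = error_rate(net, valid)
        if best is None or err < best[2]:
            best = (n_hidden, net, err)
        if err <= target:
            break
    return best


def make_data(n, rng):
    """Synthetic stand-in data: label is +1 when the feature sum is positive."""
    data = []
    for _ in range(n):
        x = [float(rng.choice([-1, 0, 1])) for _ in range(4)]
        data.append((x, 1.0 if sum(x) > 0 else -1.0))
    return data


if __name__ == "__main__":
    rng = random.Random(1)
    train, valid = make_data(80, rng), make_data(40, rng)
    n_hidden, net, err = self_structure(train, valid)
    print("hidden neurons:", n_hidden, "validation error:", err)
```

Selecting the structure on a held-out validation set, rather than on training error, is one simple way to favour structures that generalize. A real system would use website features and labels in place of the synthetic data.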

Author information

Corresponding author

Correspondence to Rami M. Mohammad.

About this article

Cite this article

Mohammad, R.M., Thabtah, F. & McCluskey, L. Predicting phishing websites based on self-structuring neural network. Neural Comput & Applic 25, 443–458 (2014). https://doi.org/10.1007/s00521-013-1490-z

  • DOI: https://doi.org/10.1007/s00521-013-1490-z
