Abstract
The Internet has become an essential component of our everyday social and financial activities. Nevertheless, Internet users may be vulnerable to different types of web threats, which may cause financial damage, identity theft, loss of private information, brand reputation damage and loss of customers’ confidence in e-commerce and online banking. Phishing is a form of web threat defined as the art of impersonating the website of an honest enterprise in order to obtain confidential information such as usernames, passwords and social security numbers. So far, there is no single solution that can capture every phishing attack. In this article, we propose an intelligent model for predicting phishing attacks based on artificial neural networks, particularly self-structuring neural networks. Phishing is a continuous problem in which the features significant in determining the type of a web page are constantly changing. Thus, the network structure must be constantly improved in order to cope with these changes. Our model solves this problem by automating the process of structuring the network, and it shows high tolerance for noisy data, fault tolerance and high prediction accuracy. Several experiments were conducted in our research, with a different number of epochs in each experiment. The results show that all the produced structures have high generalization ability.
Cite this article
Mohammad, R.M., Thabtah, F. & McCluskey, L. Predicting phishing websites based on self-structuring neural network. Neural Comput & Applic 25, 443–458 (2014). https://doi.org/10.1007/s00521-013-1490-z
DOI: https://doi.org/10.1007/s00521-013-1490-z