Skip to main content

Catching the Phish: Detecting Phishing Attacks Using Recurrent Neural Networks (RNNs)

  • Conference paper
  • First Online:
Information Security Applications (WISA 2019)

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 11897))

Included in the following conference series:

Abstract

The emergence of online services in our daily lives has been accompanied by a range of malicious attempts to trick individuals into performing undesired actions, often to the benefit of the adversary. The most popular medium of these attempts is phishing attacks, mainly through emails and websites. In order to defend against such attacks, there is an urgent need for automated mechanisms to identify this malicious content before it reaches users. Machine learning techniques have gradually become the standard for such classification problems. However, identifying common measurable features of phishing content (e.g., in emails) is notoriously difficult. To address this problem, we engage in a novel study into a phishing content classifier based on a recurrent neural network (RNN), which identifies such features without human input. At this stage, we scope our research to emails, but our approach can be extended to apply to websites. Our results show that the proposed system outperforms state-of-the-art tools. Furthermore, our classifier is efficient and takes into account only the text and, in particular, the textual structure of the email. Since these features are rarely considered in email classification, we argue that our classifier can complement existing classifiers with high information gain.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Bahnsen, A.C., Bohorquez, E.C., Villegas, S., Vargas, J., González, F.A.: Classifying phishing URLs using recurrent neural networks. In: Proceedings of the APWG Symposium on Electronic Crime Research, eCrime 2017. IEEE, April 2017. https://doi.org/10.1109/ECRIME.2017.7945048

  2. Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994). https://doi.org/10.1109/72.279181

    Article  Google Scholar 

  3. Bergholz, A., Chang, J.H., Paaß, G., Reichartz, F., Strobel, S.: Improved phishing detection using model-based features. In: Proceedings of the Fifth Conference on Email and Anti-Spam, CEAS 2008, August 2008

    Google Scholar 

  4. Bergholz, A., De Beer, J., Glahn, S., Moens, M.F., Paaß, G., Strobel, S.: New filtering approaches for phishing email. J. Comput. Secur. 18(1), 7–35 (2010). https://doi.org/10.3233/JCS-2010-0371. Special Issue on EU-funded ICT research on Trust and Security

    Article  Google Scholar 

  5. Chandrasekaran, M., Narayanan, K., Upadhyaya, S.: Phishing email detection based on structural properties. In: Proceedings of the 9th Annual NYS Cyber Security Conference, NYSCSC 2006, June 2006

    Google Scholar 

  6. Fette, I., Sadeh, N., Tomasic, A.: Learning to detect phishing emails. In: Proceedings of the 16th International Conference on World Wide Web, WWW 2007, pp. 649–656. ACM, May 2007. https://doi.org/10.1145/1242572.1242660

  7. Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451–2471 (2000). https://doi.org/10.1109/72.279181

    Article  Google Scholar 

  8. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016). https://www.deeplearningbook.org

  9. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997). https://doi.org/10.1162/neco.1997.9.8.1735

    Article  Google Scholar 

  10. Iuga, C., Nurse, J.R.C., Erola, A.: Baiting the hook: factors impacting susceptibility to phishing attacks. Hum.-Centric Comput. Inf. Sci. 6 (2016). https://doi.org/10.1186/s13673-016-0065-2

  11. Jozefowicz, R., Zaremba, W., Sutskever, I.: An empirical exploration of recurrent network architectures. In: Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, pp. 2342–2350, July 2015

    Google Scholar 

  12. Khonji, M., Iraqi, Y., Jones, A.: Phishing detection: a literature survey. IEEE Commun. Surv. Tutor. 15(4), 2091–2121 (2013). https://doi.org/10.1109/SURV.2013.032213.00009

    Article  Google Scholar 

  13. Kingma, D.P., Ba, J.L.: Adam: a method for stochastic optimization. arXiv preprint (2014). https://arxiv.org/abs/1412.6980

  14. Mohammad, R.M., Thabtah, F., McCluskey, L.: Predicting phishing websites based on self-structuring neural network. Neural Comput. Appl. 25(2), 443–458 (2014). https://doi.org/10.1007/s00521-013-1490-z

    Article  Google Scholar 

  15. Nazario, J.: https://monkey.org/~jose/phishing/

  16. Nurse, J.R.C.: Cybercrime and you: how criminals attack and the human factors that they seek to exploit. In: The Oxford Handbook of Cyberpsychology. Oxford University Press, Oxford, May 2019. https://doi.org/10.1093/oxfordhb/9780198812746.013.35

  17. Pascanu, R., Mikolov, T., Bengio, Y.: On the difficulty of training recurrent neural networks. arXiv preprint (2012). https://arxiv.org/abs/1211.5063

  18. PhishMe Inc.: 2016 enterprise phishing susceptibility and resiliency report (2016)

    Google Scholar 

  19. Porter, M.F.: Snowball: a language for stemming algorithms. https://snowballstem.org/

  20. Ramzan, Z.: Phishing attacks and countermeasures. In: Stavroulakis, P., Stamp, M. (eds.) Handbook of Information and Communication Security, pp. 433–448. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-04117-4_23

    Chapter  Google Scholar 

  21. Saxe, A.M., McClelland, J.L., Ganguli, S.: Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint (2013). https://arxiv.org/abs/1312.6120

  22. SpamAssassin. https://spamassassin.apache.org/old/publiccorpus/

  23. Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

    MathSciNet  MATH  Google Scholar 

  24. Toolan, F., Carthy, J.: Feature selection for spam and phishing detection. In: 2010 eCrime Researchers Summit, pp. 1–12. IEEE, October 2010. https://doi.org/10.1109/ecrime.2010.5706696

  25. Verma, R., Shashidhar, N., Hossain, N.: Detecting phishing emails the natural language way. In: Foresti, S., Yung, M., Martinelli, F. (eds.) ESORICS 2012. LNCS, vol. 7459, pp. 824–841. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33167-1_47

    Chapter  Google Scholar 

  26. Vinayakumar, R., Soman, K.P., Poornachandran, P.: Evaluating deep learning approaches to characterize and classify malicious URLs. J. Intell. Fuzzy Syst. 34(3), 1333–1343 (2018). https://doi.org/10.3233/JIFS-169423

    Article  Google Scholar 

  27. Zaremba, W., Sutskever, I., Vinyals, O.: Recurrent neural network regularization. arXiv preprint (2014). https://arxiv.org/abs/arXiv:1409.2329

  28. Zhao, J., Wang, N., Ma, Q., Cheng, Z.: Classifying malicious URLs using gated recurrent neural networks. In: Barolli, L., Xhafa, F., Javaid, N., Enokido, T. (eds.) IMIS 2018. AISC, vol. 773, pp. 385–394. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-93554-6_36

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Lukáš Halgaš .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Halgaš, L., Agrafiotis, I., Nurse, J.R.C. (2020). Catching the Phish: Detecting Phishing Attacks Using Recurrent Neural Networks (RNNs). In: You, I. (eds) Information Security Applications. WISA 2019. Lecture Notes in Computer Science(), vol 11897. Springer, Cham. https://doi.org/10.1007/978-3-030-39303-8_17

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-39303-8_17

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-39302-1

  • Online ISBN: 978-3-030-39303-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics