That Ain’t You: Blocking Spearphishing Through Behavioral Modelling

  • Gianluca Stringhini
  • Olivier Thonnard
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9148)


One of the ways in which attackers steal sensitive information from corporations is by sending spearphishing emails. A typical spearphishing email appears to be sent by one of the victim’s coworkers or business partners, but has instead been crafted by the attacker. A particularly insidious type of spearphishing email not only claims to be written by a certain person, but is also sent from that person’s email account, which has been compromised. Spearphishing emails are very dangerous for companies, because they can be the starting point for a more sophisticated attack or can cause intellectual property theft, leading to high financial losses. Currently, there are no effective systems to protect users against such threats. Existing systems leverage adaptations of anti-spam techniques, but these techniques are often inadequate for detecting spearphishing attacks, because spearphishing has very different characteristics from spam and even from traditional phishing. To fight the spearphishing threat, we propose a change of focus in the techniques that we use for detecting malicious emails: instead of looking for features that are indicative of attack emails, we look for emails that claim to have been written by a certain person within a company, but were actually authored by an attacker. We do this by modelling the email-sending behavior of users over time, and comparing any subsequent email sent by their accounts against this model. Our approach can block advanced email attacks that traditional protection systems are unable to detect, and is an important step towards detecting advanced spearphishing attacks.
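The idea of modelling a user's sending behavior and comparing later emails against that model can be illustrated with a minimal sketch. This is not the system described in the paper: the five features, the per-feature mean and standard deviation profile, the z-score comparison, and the `THRESHOLD` value are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import string
from statistics import mean, pstdev


def features(body: str, hour: int, n_recipients: int) -> list:
    """Map one email to a small numeric feature vector (illustrative features only)."""
    words = body.split()
    n_chars = max(len(body), 1)
    return [
        mean(len(w) for w in words) if words else 0.0,         # average word length
        sum(c.isupper() for c in body) / n_chars,              # share of uppercase characters
        sum(c in string.punctuation for c in body) / n_chars,  # share of punctuation characters
        float(hour),          # sending hour, treated linearly (ignores midnight wrap-around)
        float(n_recipients),  # number of recipients
    ]


def build_profile(history: list) -> list:
    """Summarise a sender's past emails as per-feature (mean, std) pairs."""
    return [(mean(col), pstdev(col)) for col in zip(*history)]


def anomaly_score(profile: list, vector: list, eps: float = 1e-6) -> float:
    """Mean absolute z-score of a new email against the sender's profile."""
    return mean(abs(x - mu) / (sigma + eps) for (mu, sigma), x in zip(profile, vector))


# Hypothetical sending history for one account.
history = [
    features("Hi Bob, the report is attached. Thanks, Alice", 10, 1),
    features("Hi Carol, can we move the meeting to 3pm? Thanks, Alice", 11, 1),
    features("Hi team, notes from today are on the wiki. Thanks, Alice", 15, 3),
    features("Hi Bob, looks good to me. Thanks, Alice", 9, 1),
]
profile = build_profile(history)

usual = features("Hi Bob, I will send the slides tomorrow. Thanks, Alice", 10, 1)
odd = features("URGENT!!! OPEN THE ATTACHED INVOICE NOW!!!", 3, 40)

THRESHOLD = 3.0  # assumed cut-off; a real system would tune this on data
for name, vec in [("usual", usual), ("odd", odd)]:
    score = anomaly_score(profile, vec)
    print(name, "flagged" if score > THRESHOLD else "accepted")
```

A z-score profile is only a stand-in that keeps the example dependency-free; a trained classifier per sender would be a natural replacement for `anomaly_score`, with the same feature-extraction step feeding it.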


Feature Vector · Behavioral Profile · Sequential Minimal Optimization · Mail Server · Email Account
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work was supported by a Symantec Research Labs Graduate Fellowship for the year 2012. We would like to thank the anonymous reviewers for their useful comments. We would also like to thank the people at Symantec, in particular Marc Dacier, David T. Lin, Dermot Harnett, Joe Krug, David Cawley, and Nick Johnston, for their support and comments. We would also like to thank Adam Doupé and Ali Zand for reviewing an early version of this paper. Their feedback was very helpful.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. University College London, London, UK
  2. Amadeus, Sophia Antipolis, France
