Detecting Spammers via Aggregated Historical Data Set

  • Eitan Menahem
  • Rami Pusiz
  • Yuval Elovici
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7645)


In this work we propose a new sender reputation mechanism based on an aggregated historical dataset that encodes the behavior of mail transfer agents over exponentially growing time windows. The proposed mechanism is targeted mainly at large enterprises and email service providers and can be used to update both blacklists and whitelists. We evaluate the proposed mechanism using 9.5M anonymized log entries obtained from the biggest Internet service provider in Europe. Experiments show that the proposed method detects more than 94% of the spam emails that escaped the blacklist (i.e., the true positive rate, TPR), with fewer than 0.5% false alarms. The effectiveness of the proposed method is therefore much higher than that of previously reported reputation mechanisms that rely on email logs. In addition, on our dataset the proposed method eliminated the need for automatic content inspection of 4 out of 5 incoming emails, which dramatically reduced the computational load of filtering.
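
The idea of encoding sender behavior over exponentially growing time windows can be illustrated with a minimal sketch. The abstract does not specify the features or the window layout, so everything below is an assumption made for illustration: the function names `window_lengths` and `aggregate_history`, the log tuple layout, the choice of nested windows that all end at the current time, and the two per-window features (email count and spam ratio).

```python
def window_lengths(num_windows, base_seconds=60):
    """Return window lengths that double each step: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** i) for i in range(num_windows)]


def aggregate_history(log_entries, sender_ip, now, num_windows=4, base_seconds=60):
    """Summarize one sender's behavior over exponentially growing windows.

    log_entries: list of (timestamp_seconds, sender_ip, is_spam) tuples
                 (hypothetical layout, not taken from the paper).
    Window i covers the interval (now - base_seconds * 2**i, now].
    Returns a list of (window_length, email_count, spam_ratio) tuples.
    """
    features = []
    for length in window_lengths(num_windows, base_seconds):
        total = 0
        spam = 0
        for ts, ip, is_spam in log_entries:
            # Count only this sender's emails that fall inside the window.
            if ip == sender_ip and now - length < ts <= now:
                total += 1
                spam += int(is_spam)
        ratio = spam / total if total else 0.0
        features.append((length, total, ratio))
    return features


# Hypothetical log: three recent entries for sender "a", one old entry, one for "b".
log = [
    (100, "a", True),
    (800, "a", True),
    (950, "a", False),
    (990, "a", True),
    (995, "b", True),
]
history = aggregate_history(log, "a", now=1000, num_windows=3)
```

Because each window is twice as long as the previous one, a short feature vector covers a long history while keeping fine resolution for recent behavior. A classifier trained on such vectors could then suggest blacklist or whitelist updates, though the actual learning setup is described in the paper rather than here.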


Aggregate Feature, Spam Email, Black List, Address Error, History Length
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Eitan Menahem (1)
  • Rami Pusiz (1)
  • Yuval Elovici (1)
  1. Telekom Innovation Laboratories, Information System Engineering Department, Ben-Gurion University, Be’er Sheva, Israel
