
Artificial Intelligence Review, Volume 40, Issue 1, pp 71–105

A survey of image spamming and filtering techniques

  • Abdolrahman Attar
  • Reza Moradi Rad
  • Reza Ebrahimi Atani
Article

Abstract

Many techniques have been proposed to combat the upsurge in image-based spam. All of them share the same goal: keeping image spam out of our inboxes. Image spammers evade filters with a variety of tricks, and each trick needs to be analyzed to determine what capabilities a filter must have to defeat it and stop spammers from filling our inboxes. Different tricks give rise to different techniques. This work surveys the image spam phenomenon from all sides, including definitions, image spamming tricks, anti-image-spam techniques, and data sets. We describe each image spamming trick separately and, by examining the methods researchers have used to combat them, classify these methods into three groups: header-based, content-based, and text-based. Finally, we discuss the data sets that researchers use in the experimental evaluation of their work to demonstrate the accuracy of their ideas.

Keywords

Image spam Image classification Spam filtering techniques 

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  • Abdolrahman Attar (1)
  • Reza Moradi Rad (1)
  • Reza Ebrahimi Atani (1), corresponding author

  1. Department of Computer Engineering, The University of Guilan, Rasht, Iran
