Detecting redirection spam using multilayer perceptron neural network

  • Methodologies and Application
Soft Computing

Abstract

Retrieving high-quality information from the Web is essential for every search engine. Spammers undermine this quality by making heavy use of malicious redirections for phishing, malware downloads, or attaining a high search engine ranking. Malicious redirections present irrelevant content to the search user, which reduces user satisfaction and wastes network bandwidth. In this paper, we propose a neural framework for detecting redirection spam. We use a feed-forward multilayer perceptron network trained with the scaled conjugate gradient algorithm, which performs very fast classification of URLs that lead to redirection spam. We investigated the network empirically to choose the number of hidden layers and observed that training the network with two hidden layers gives better accuracy. We validated the proposed approach on a dataset of 2383 URLs and detected spammed redirections with high accuracy. The results indicate that neural networks are a very effective technique for modeling redirection spam detection.
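As a rough illustration of the kind of classifier the abstract describes, the following is a minimal sketch of a feed-forward multilayer perceptron with two hidden layers. It is not the authors' implementation. The three input features, the layer sizes (8 and 4), the learning rate, and the synthetic labels are all assumptions made for the example, and plain full-batch gradient descent is used as a stand-in for the scaled conjugate gradient algorithm, which is not reproduced here.

```python
import numpy as np

def init_params(sizes, rng):
    # One (weights, bias) pair per layer, with small random weights.
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    # Returns the activations of every layer: tanh hidden units, sigmoid output.
    acts = [X]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        is_output = i == len(params) - 1
        acts.append(1.0 / (1.0 + np.exp(-z)) if is_output else np.tanh(z))
    return acts

def train(params, X, y, lr=0.5, epochs=2000):
    # Full-batch gradient descent on cross-entropy loss.
    # Assumption: this replaces scaled conjugate gradient for brevity.
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        acts = forward(params, X)
        delta = (acts[-1] - y) / len(X)  # gradient at the output pre-activation
        for i in reversed(range(len(params))):
            W, b = params[i]
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:
                # Backpropagate through the tanh of the preceding hidden layer.
                delta = (delta @ W.T) * (1.0 - acts[i] ** 2)
            params[i] = (W - lr * grad_W, b - lr * grad_b)
    return params

def predict(params, X):
    # Label 1 stands for "redirection spam", 0 for "legitimate".
    return (forward(params, X)[-1].ravel() >= 0.5).astype(int)

# Synthetic stand-in data: three hypothetical URL features per sample.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Input layer, two hidden layers (8 and 4 units, chosen arbitrarily), one output.
params = train(init_params([3, 8, 4, 1], rng), X, y)
accuracy = float((predict(params, X) == y).mean())
print(f"training accuracy: {accuracy:.3f}")
```

The accuracy printed here is measured on the synthetic training data only and says nothing about the results reported in the paper.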

Author information

Corresponding author

Correspondence to Kanchan Hans.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed consent

Consent to submit has been received explicitly from all co-authors, as well as from the responsible authorities—tacitly or explicitly—at the institute/organization where the work has been carried out before the work is submitted.

Research involving human participants and/or animals

Our research does not include human participants or animals.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Hans, K., Ahuja, L. & Muttoo, S.K. Detecting redirection spam using multilayer perceptron neural network. Soft Comput 21, 3803–3814 (2017). https://doi.org/10.1007/s00500-017-2531-9

  • DOI: https://doi.org/10.1007/s00500-017-2531-9
