Malicious Web Page Detection: A Machine Learning Approach

  • Conference paper
Advances in Computer Science and its Applications

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 279)

Abstract

Due to the rapid growth of the Internet, websites have become a main target for intruders. An intruder embeds malicious content in a web page to carry out unwanted activities such as stealing credentials and resources, luring a user to visit a dangerous website, downloading and installing software that enlists the machine in a botnet or a distributed denial-of-service attack, and even damaging the visitor's system. As the number of web pages grows, the number of malicious web pages also grows, and the attacks are becoming increasingly sophisticated. In this paper, we provide a framework for detecting malicious web pages using artificial neural network learning techniques. In addition to achieving a significant detection rate, our objective is to identify which discriminative features characterize the attack and to reduce the false positive rate. The algorithm is based on two groups of features: URL lexical features and page content features. The experiments have shown the expected results, and the high false positive rate typically produced by machine learning approaches is reduced.
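As a rough illustration of the two feature groups named in the abstract, the following minimal sketch derives a few URL lexical features and page content features and computes a false positive rate. The abstract does not list the actual features used, so every feature name here (`url_length`, `num_script_tags`, and so on) and every helper function is a hypothetical choice made for illustration, not the authors' method. The resulting vector is the kind of input a neural network classifier might consume.

```python
import re
from urllib.parse import urlparse


def url_lexical_features(url: str) -> dict:
    """Hypothetical URL lexical features (not taken from the paper)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Raw IPv4 hosts are a commonly cited warning sign (assumption).
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        "num_path_segments": len([s for s in parsed.path.split("/") if s]),
    }


def page_content_features(html: str) -> dict:
    """Hypothetical page content features based on simple substring counts."""
    lowered = html.lower()
    return {
        "num_script_tags": lowered.count("<script"),
        "num_iframe_tags": lowered.count("<iframe"),
        "num_eval_calls": len(re.findall(r"\beval\s*\(", lowered)),
        "content_length": len(html),
    }


def feature_vector(url: str, html: str) -> list:
    """Merge both feature groups into one numeric vector, ordered by name."""
    combined = {**url_lexical_features(url), **page_content_features(html)}
    return [float(combined[name]) for name in sorted(combined)]


def false_positive_rate(y_true, y_pred) -> float:
    """FPR = FP / (FP + TN), with label 1 = malicious and 0 = benign."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0
```

A call such as `feature_vector(url, html)` yields one fixed-length row per page. Keeping the two groups in separate functions makes it easy to test how much each group contributes to detection, which fits the abstract's stated goal of finding the discriminative features.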



Author information

Corresponding author

Correspondence to Abubakr Sirageldin.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sirageldin, A., Baharudin, B.B., Jung, L.T. (2014). Malicious Web Page Detection: A Machine Learning Approach. In: Jeong, H., S. Obaidat, M., Yen, N., Park, J. (eds) Advances in Computer Science and its Applications. Lecture Notes in Electrical Engineering, vol 279. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41674-3_32

  • DOI: https://doi.org/10.1007/978-3-642-41674-3_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41673-6

  • Online ISBN: 978-3-642-41674-3

  • eBook Packages: Engineering, Engineering (R0)
