Abstract
With the rapid growth of the internet, websites have become a main target for intruders. An intruder embeds malicious content in a web page to carry out unwanted activities such as stealing credentials and resources, luring a user to visit a dangerous website, downloading and installing software that joins a botnet or participates in distributed denial-of-service attacks, and even damaging the visitor's system. As the number of web pages grows, the number of malicious web pages grows as well, and the attacks are becoming increasingly sophisticated. In this paper, we provide a framework for detecting malicious web pages using artificial neural network learning techniques. In addition to achieving a significant detection rate, our objective is to identify which discriminative features characterize the attack and to reduce the false positive rate. The algorithm is based on two feature groups: URL lexical features and page content features. The experiments have shown the expected results, and the high false positive rate that machine learning approaches produce is reduced.
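The two feature groups named in the abstract can be illustrated with a minimal sketch. The abstract does not list the individual features, so every feature name below, and the helper functions `url_lexical_features` and `content_features`, are hypothetical choices made for illustration rather than the authors' feature set.

```python
import re
from urllib.parse import urlparse


def url_lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string.

    Assumption: these particular features are examples only; the paper's
    actual URL lexical features are not given in the abstract.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Hosts written as a raw IPv4 address instead of a domain name.
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        "num_path_segments": len([s for s in parsed.path.split("/") if s]),
        "has_at_symbol": "@" in url,
    }


def content_features(html: str) -> dict:
    """Compute a few illustrative page content features from raw HTML.

    Assumption: simple substring and regex counts stand in for whatever
    content analysis the paper performs.
    """
    lowered = html.lower()
    return {
        "html_length": len(html),
        "num_script_tags": lowered.count("<script"),
        "num_iframe_tags": lowered.count("<iframe"),
        "num_eval_calls": len(re.findall(r"\beval\s*\(", lowered)),
    }


def feature_vector(url: str, html: str) -> list:
    """Concatenate both groups into one numeric vector in a fixed order."""
    merged = {**url_lexical_features(url), **content_features(html)}
    return [float(merged[name]) for name in sorted(merged)]
```

A vector built this way could serve as the input layer of a neural network classifier; the network architecture and training procedure are left out here because the abstract does not describe them.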
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sirageldin, A., Baharudin, B.B., Jung, L.T. (2014). Malicious Web Page Detection: A Machine Learning Approach. In: Jeong, H., S. Obaidat, M., Yen, N., Park, J. (eds) Advances in Computer Science and its Applications. Lecture Notes in Electrical Engineering, vol 279. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41674-3_32
DOI: https://doi.org/10.1007/978-3-642-41674-3_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41673-6
Online ISBN: 978-3-642-41674-3
eBook Packages: Engineering, Engineering (R0)