Abstract
Because the Internet is anonymous and largely uncontrolled, it is open to phishing attacks, which trick users into viewing malicious content and giving up their personal information. The number of victims of this kind of attack is increasing significantly because security mechanisms are inadequate. This study develops a cyberbullying detection system that derives features from Twitter text using a pointwise mutual information (PMI) approach. A supervised machine learning method is then developed to detect cyberbullying. In addition to the PMI-semantic orientation, the study uses sentiment, lexicon, and embedding features. The extracted features are classified with the SVM, Naive Bayes, KNN, decision tree, and random forest algorithms. Experiments with the proposed framework in multi-class and binary settings show considerable potential in terms of kappa values, accuracy, and F-measure values. These findings suggest that the proposed framework is a suitable option for recognizing cyberbullying behavior in online social networks. Finally, the proposed features are compared with baseline features across the machine learning algorithms. Tenfold cross-validation yielded a highest accuracy of about 90.36%, and all four experiments evaluated the random forest algorithm on the 80% training split of the dataset. On the 20% test split, the random forest algorithm also achieved higher accuracy.
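The abstract does not spell out how the PMI-semantic orientation feature is computed. The sketch below shows one common formulation, in which a word is scored by the difference between its PMI with the bullying class and its PMI with the non-bullying class. The function names, the smoothing constant, and the toy texts are all assumptions made for illustration and are not taken from the chapter.

```python
import math
from collections import Counter


def pmi_semantic_orientation(texts, labels, positive_label, smoothing=1.0):
    """Score each word by PMI(word, positive) - PMI(word, negative).

    Since PMI(w, c) = log2(P(w | c) / P(w)), the P(w) terms cancel and the
    difference reduces to log2(P(w | positive) / P(w | negative)).
    """
    doc_freq = {True: Counter(), False: Counter()}
    n_docs = {True: 0, False: 0}
    for text, label in zip(texts, labels):
        is_pos = label == positive_label
        # Count each word once per text (document-level presence).
        doc_freq[is_pos].update(set(text.lower().split()))
        n_docs[is_pos] += 1

    scores = {}
    for word in set(doc_freq[True]) | set(doc_freq[False]):
        # Additive smoothing (an assumption) keeps the ratio finite for
        # words that occur in only one class.
        p_pos = (doc_freq[True][word] + smoothing) / (n_docs[True] + 2 * smoothing)
        p_neg = (doc_freq[False][word] + smoothing) / (n_docs[False] + 2 * smoothing)
        scores[word] = math.log2(p_pos / p_neg)
    return scores


def text_feature(text, scores):
    """Mean orientation of the known words in a text (0.0 if none are known)."""
    known = [scores[w] for w in text.lower().split() if w in scores]
    return sum(known) / len(known) if known else 0.0


# Toy, made-up examples; label 1 = bullying, 0 = not bullying.
texts = ["you are stupid", "have a nice day", "stupid idiot", "nice work today"]
labels = [1, 0, 1, 0]
scores = pmi_semantic_orientation(texts, labels, positive_label=1)
print(round(scores["stupid"], 3), round(scores["nice"], 3))
```

Words seen mostly in bullying texts receive positive scores and words seen mostly in other texts receive negative ones. The per-text mean from `text_feature` could then be passed to a classifier alongside sentiment, lexicon, and embedding features.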
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Pandey, M.K., Singh, M.K., Pal, S., Tiwari, B.B. (2023). Analysis of Phishing Base Problems Using Random Forest Features Selection Techniques and Machine Learning Classifiers. In: Jacob, I.J., Kolandapalayam Shanmugam, S., Izonin, I. (eds) Data Intelligence and Cognitive Informatics. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-6004-8_5
DOI: https://doi.org/10.1007/978-981-19-6004-8_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6003-1
Online ISBN: 978-981-19-6004-8
eBook Packages: Intelligent Technologies and Robotics (R0)