Detecting Smishing Attacks Using Feature Extraction and Classification Techniques

  • Conference paper

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 95)

Abstract

Phishing scams via SMS (smishing) have become common owing to the widespread use of smartphones and the availability of mobile Internet technologies. Identifying a phishing SMS by analyzing unstructured short texts is a challenging problem in AI-driven cybersecurity. Machine learning techniques integrated with natural language processing (NLP) have considerable potential to identify patterns that differentiate phishing from legitimate SMS. In this paper, we experiment with several state-of-the-art machine learning algorithms on a benchmark dataset and incorporate NLP-based feature extraction and feature selection steps to build an automated phishing detection strategy. The support vector machine classifier, applied after feature extraction and feature selection, outperformed the other tested methods, with a tenfold cross-validation score of 98.27%, an F1-score of 99.08% for legitimate SMS, and an accuracy of 98.39%. The performance of the tested methods is evaluated using popular evaluation metrics.
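The abstract does not specify which features or selection criterion the authors use, so the following is only a minimal sketch of the general idea of NLP-based feature extraction followed by feature selection. It assumes unigram word tokens as features and a chi-square score as the selection criterion; the toy messages, the labels "smishing" and "legitimate", and the function names are all hypothetical and are not taken from the paper or its benchmark dataset.

```python
import re
from collections import Counter

# Made-up messages for illustration only (not from the benchmark dataset).
MESSAGES = [
    ("Your account is locked, verify now at http://example.com/login", "smishing"),
    ("You won a prize! Claim now by replying with your PIN", "smishing"),
    ("Urgent: verify your bank details now to avoid suspension", "smishing"),
    ("Are we still meeting for lunch tomorrow?", "legitimate"),
    ("I'll call you after the lecture", "legitimate"),
    ("Can you send me the notes from class?", "legitimate"),
]


def tokenize(text):
    """Feature extraction (assumed): lowercase and split into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chi_square_scores(messages, positive="smishing"):
    """Score each term with the chi-square statistic of its 2x2 table
    (term present/absent versus positive class/other class)."""
    n = len(messages)
    n_pos = sum(1 for _, label in messages if label == positive)
    doc_freq, pos_freq = Counter(), Counter()
    for text, label in messages:
        for term in set(tokenize(text)):  # count each term once per message
            doc_freq[term] += 1
            if label == positive:
                pos_freq[term] += 1
    scores = {}
    for term, df in doc_freq.items():
        a = pos_freq[term]      # positive messages containing the term
        b = df - a              # other messages containing the term
        c = n_pos - a           # positive messages without the term
        d = (n - n_pos) - b     # other messages without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Feature selection: keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


if __name__ == "__main__":
    scores = chi_square_scores(MESSAGES)
    print(select_top_k(scores, 2))
```

In practice, a pipeline of this kind would more likely be assembled from library components, for example scikit-learn's `TfidfVectorizer`, `SelectKBest` with `chi2`, and an SVM classifier scored with `cross_val_score(cv=10)`; that combination is likewise an assumption, not a description of the authors' implementation.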



Author information

Corresponding author

Correspondence to Iqbal H. Sarker.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ulfath, R.E., Sarker, I.H., Chowdhury, M.J.M., Hammoudeh, M. (2022). Detecting Smishing Attacks Using Feature Extraction and Classification Techniques. In: Arefin, M.S., Kaiser, M.S., Bandyopadhyay, A., Ahad, M.A.R., Ray, K. (eds) Proceedings of the International Conference on Big Data, IoT, and Machine Learning. Lecture Notes on Data Engineering and Communications Technologies, vol 95. Springer, Singapore. https://doi.org/10.1007/978-981-16-6636-0_51

  • DOI: https://doi.org/10.1007/978-981-16-6636-0_51

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-6635-3

  • Online ISBN: 978-981-16-6636-0

  • eBook Packages: Engineering, Engineering (R0)
