Extra-Tree Classifier with Metaheuristics Approach for Email Classification

  • Conference paper
Advances in Computer Communication and Computational Sciences

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 924)

Abstract

It is common for a user to receive hundreds of emails every day. Almost 93% of them are spam messages, mainly advertisements from industries such as software, gambling, stocks, electronics, pharmaceuticals, and loans, along with phishing and malware attempts. Spam messages not only waste the user’s time but also consume valuable storage space. In this paper, a nature-inspired metaheuristic technique is used for email classification, with emphasis on reducing the false-positive problem of treating spam messages as ham. The approach uses metaheuristics-based feature selection methods and employs an extra-tree classifier to classify emails as spam or ham. The proposed model achieves an accuracy of 95.5%, a specificity of 93.7%, and an F1-score of 96.3%, an improvement over previous research in this field that used decision trees. A comparative analysis of the extra-tree classifier against other classifiers, such as decision trees and random forests, is also presented.
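
The three figures reported in the abstract (accuracy, specificity, F1-score) can all be derived from a binary confusion matrix. The sketch below is only an illustration of how such metrics are usually computed, not the authors' evaluation code. It assumes ham is the positive class, which matches the abstract's description of a false positive as a spam message treated as ham. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, specificity, and F1-score from confusion-matrix counts.

    Assumption (not stated in the paper): ham is the positive class, so
      tp = ham classified as ham      fp = spam classified as ham
      tn = spam classified as spam    fn = ham classified as spam
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Fraction of all emails that were classified correctly.
    accuracy = (tp + tn) / total
    # Fraction of spam (negative class) that was correctly caught.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Precision and recall for the positive class (ham).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"accuracy": accuracy, "specificity": specificity, "f1": f1}


# Hypothetical counts for illustration only; these are not the paper's data.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this convention, lowering the number of spam messages that reach the inbox as ham (`fp`) raises specificity directly, which is why specificity is a natural metric for the false-positive problem the abstract emphasizes.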

Corresponding author

Correspondence to Aakanksha Sharaff.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Sharaff, A., Gupta, H. (2019). Extra-Tree Classifier with Metaheuristics Approach for Email Classification. In: Bhatia, S., Tiwari, S., Mishra, K., Trivedi, M. (eds) Advances in Computer Communication and Computational Sciences. Advances in Intelligent Systems and Computing, vol 924. Springer, Singapore. https://doi.org/10.1007/978-981-13-6861-5_17
