A Competitive Three-Level Pruning Technique for Information Security

  • Conference paper
Applications and Techniques in Information Security (ATIS 2014)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 490)

Abstract

Reducing the size of ensemble classifiers is important for various security applications. Most known pruning algorithms fall into three categories: ranking-based, clustering-based, and optimisation-based methods. This paper introduces and investigates a new pruning technique, the Three-Level Pruning Technique (TLPT), so called because it simultaneously combines all three approaches in three levels of the process. We investigate a TLPT that combines the state-of-the-art ranking of Ensemble Pruning via Individual Contribution ordering (EPIC), the clustering of K-Means Pruning (KMP), and the optimisation method of Directed Hill Climbing Ensemble Pruning (DHCEP), applied to a phishing dataset. Our experiments show that the TLPT is competitive with EPIC, KMP and DHCEP, and can achieve better outcomes. These experimental results demonstrate the effectiveness of the TLPT in this information security application.
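The idea of chaining clustering, ranking and optimisation can be illustrated with a small sketch. The abstract does not specify how the three levels are ordered or parameterised, so everything below is an assumption made for illustration: the function names, the parameters `k` and `keep`, the ordering of the levels, and the simplified stand-ins used in place of KMP (plain k-means over prediction vectors), EPIC (ranking by individual validation accuracy rather than individual contribution), and DHCEP (forward greedy hill climbing on majority-vote accuracy). It is not the authors' implementation.

```python
import random

def majority_vote(preds, members):
    """Majority vote of the chosen classifiers on each example (binary 0/1 labels)."""
    n = len(preds[0])
    return [1 if 2 * sum(preds[m][i] for m in members) > len(members) else 0
            for i in range(n)]

def accuracy(votes, labels):
    """Fraction of examples on which the votes match the labels."""
    return sum(v == y for v, y in zip(votes, labels)) / len(labels)

def kmeans_clusters(preds, k, iters=20, seed=0):
    """Clustering level (stand-in for KMP): group classifiers by prediction vector."""
    rng = random.Random(seed)
    centers = [list(preds[i]) for i in rng.sample(range(len(preds)), k)]
    assign = [0] * len(preds)
    for _ in range(iters):
        # Assign each classifier to its nearest centre (squared Euclidean distance).
        assign = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in preds]
        # Recompute each non-empty cluster's centre as the mean prediction vector.
        for c in range(k):
            members = [preds[i] for i in range(len(preds)) if assign[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def rank_within_clusters(preds, labels, assign, keep):
    """Ranking level (simplified stand-in for EPIC): keep the most accurate per cluster."""
    selected = []
    for c in set(assign):
        members = [i for i, a in enumerate(assign) if a == c]
        members.sort(key=lambda i: accuracy(preds[i], labels), reverse=True)
        selected.extend(members[:keep])
    return sorted(selected)

def hill_climb(preds, labels, candidates):
    """Optimisation level (stand-in for DHCEP): greedy forward selection."""
    chosen, best = [], 0.0
    improved = True
    while improved:
        improved = False
        for c in candidates:
            if c in chosen:
                continue
            score = accuracy(majority_vote(preds, chosen + [c]), labels)
            if score > best:
                best, pick, improved = score, c, True
        if improved:
            chosen.append(pick)  # add the single best improving classifier
    return chosen, best

def three_level_prune(preds, labels, k=2, keep=2):
    """Chain the three levels: cluster, rank within clusters, then hill climb."""
    assign = kmeans_clusters(preds, k)
    candidates = rank_within_clusters(preds, labels, assign, keep)
    return hill_climb(preds, labels, candidates)
```

Here `preds[i]` is the list of 0/1 predictions that classifier `i` makes on a held-out validation set and `labels` holds the true classes; `three_level_prune(preds, labels)` returns the indices of the retained classifiers together with their majority-vote accuracy on that set. Clustering first keeps the candidate pool diverse, so the final search runs over at most `k * keep` classifiers instead of the full ensemble.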


Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chowdhury, M., Abawajy, J., Kelarev, A., Sakurai, K. (2014). A Competitive Three-Level Pruning Technique for Information Security. In: Batten, L., Li, G., Niu, W., Warren, M. (eds) Applications and Techniques in Information Security. ATIS 2014. Communications in Computer and Information Science, vol 490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45670-5_3

  • DOI: https://doi.org/10.1007/978-3-662-45670-5_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45669-9

  • Online ISBN: 978-3-662-45670-5

  • eBook Packages: Computer Science (R0)
