Abstract
Reducing the size of ensemble classifiers is important for various security applications. Most known pruning algorithms fall into three categories: ranking-based, clustering-based, and optimization-based methods. This paper introduces and investigates a new pruning technique, the Three-Level Pruning Technique (TLPT), so called because it simultaneously combines all three approaches in three levels of the process. We investigate the TLPT method combining the state-of-the-art ranking of Ensemble Pruning via Individual Contribution ordering (EPIC), the clustering of K-Means Pruning (KMP), and the optimization method of Directed Hill Climbing Ensemble Pruning (DHCEP) on a phishing dataset. Our experiments show that the TLPT is competitive with EPIC, KMP and DHCEP, and can achieve better outcomes. These results demonstrate the effectiveness of the TLPT technique in this information security application.
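The abstract does not say how the three levels are ordered or combined, so the following is only a minimal sketch of one plausible arrangement, not the authors' TLPT. It assumes that classifiers are first clustered by their validation predictions, then ranked inside each cluster, and that a forward hill climb finally picks the sub-ensemble. All function names are hypothetical. Plain validation accuracy stands in for EPIC's individual-contribution measure, a k-medoids grouping on Hamming distance stands in for KMP's k-means, and greedy forward selection stands in for DHCEP.

```python
from collections import Counter


def accuracy(votes, y):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(v == t for v, t in zip(votes, y)) / len(y)


def hamming(a, b):
    """Number of samples on which two classifiers disagree."""
    return sum(x != z for x, z in zip(a, b))


def majority_vote(preds, members):
    """Majority vote of the chosen classifiers; ties go to the smallest label."""
    out = []
    for j in range(len(preds[0])):
        counts = Counter(preds[i][j] for i in members)
        top = max(counts.values())
        out.append(min(lbl for lbl, c in counts.items() if c == top))
    return out


def cluster_classifiers(preds, k, iters=10):
    """Level 1 (assumed): k-medoids-style clustering of prediction vectors."""
    medoids = list(range(k))  # deterministic start: the first k classifiers
    for _ in range(iters):
        groups = {m: [] for m in medoids}
        for i, p in enumerate(preds):
            nearest = min(medoids, key=lambda m: hamming(p, preds[m]))
            groups[nearest].append(i)
        clusters = [g for g in groups.values() if g]  # drop empty clusters
        new = [min(g, key=lambda c: sum(hamming(preds[c], preds[o]) for o in g))
               for g in clusters]
        if sorted(new) == sorted(medoids):
            break
        medoids = new
    return clusters


def rank_within_clusters(preds, y, clusters, keep):
    """Level 2 (assumed): keep the `keep` most accurate members per cluster."""
    candidates = []
    for g in clusters:
        ranked = sorted(g, key=lambda i: accuracy(preds[i], y), reverse=True)
        candidates.extend(ranked[:keep])
    return candidates


def hill_climb(preds, y, candidates):
    """Level 3 (assumed): forward selection while voting accuracy improves."""
    selected = [max(candidates, key=lambda i: accuracy(preds[i], y))]
    best = accuracy(majority_vote(preds, selected), y)
    while True:
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        score, pick = max(
            (accuracy(majority_vote(preds, selected + [c]), y), c)
            for c in remaining
        )
        if score <= best:  # stop at a local optimum
            break
        selected.append(pick)
        best = score
    return selected, best


def tlpt_sketch(preds, y, k=3, keep=2):
    """Chain the three assumed levels; returns (selected indices, accuracy)."""
    clusters = cluster_classifiers(preds, k)
    candidates = rank_within_clusters(preds, y, clusters, keep)
    return hill_climb(preds, y, candidates)
```

Here `preds` is a list with one validation-set prediction vector per base classifier and `y` holds the true labels. Because the hill climb starts from the most accurate candidate and only accepts strict improvements, the returned sub-ensemble is never worse on the validation set than the best single classifier.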
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chowdhury, M., Abawajy, J., Kelarev, A., Sakurai, K. (2014). A Competitive Three-Level Pruning Technique for Information Security. In: Batten, L., Li, G., Niu, W., Warren, M. (eds) Applications and Techniques in Information Security. ATIS 2014. Communications in Computer and Information Science, vol 490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45670-5_3
DOI: https://doi.org/10.1007/978-3-662-45670-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45669-9
Online ISBN: 978-3-662-45670-5
eBook Packages: Computer Science (R0)