Deep Learning Classification for Encrypted Botnet Traffic: Optimising Model Performance and Resource Utilisation

  • Conference paper
South African Computer Science and Information Systems Research Trends (SAICSIT 2024)

Abstract

Detection of malicious traffic on a network is critical to ensuring the safety and security of internet systems. Classical approaches to this task increasingly struggle with modern networking practices such as encryption. Deep learning (DL) offers an alternative approach to traffic classification problems. We address two major problem classes: (1) botnet detection and (2) botnet family classification. For each problem, we explore five implementations of DL architectures: a multi-layer perceptron (MLP), a shallow and a deep convolutional neural network (CNN v1 and CNN v2, respectively), an autoencoder (AE) and an autoencoder combined with a convolutional neural network (AE+CNN). We evaluate the models for each problem class on their classification performance and computational requirements. We further investigate the effect of training the models on an input with a reduced feature space, evaluating the resulting trade-off between computational and classification performance. For botnet detection, we find that all models attain good classification performance (\(\ge 0.979\) accuracy) on a normal testing set; however, this performance drops substantially when the models are evaluated on a set of unknown botnet families. Furthermore, we observe a clear trend between increased feature space and memory utilisation, while finding no evidence of a trend between inference time and feature space. For botnet family classification, we find that models which implement CNN architectures outperform the others by a substantial margin (\(\approx 6\) percentage points). We observe the same trend between feature space and memory utilisation, and the same absence of an apparent relationship between feature space and inference time.

Notes

  1. Definitions for these are found in Sect. 5.1.

  2. Detection Rate is identical to Recall.

  3. DR and FPR are defined as \(\frac{TP}{TP+FN}\) and \(\frac{FP}{FP+TN}\), respectively.

  4. The repositories can be found at: www.stratosphereips.org/datasets-overview.

  5. openArgus can be found at: https://openargus.org.

  6. The specific features present in these datasets are described in Table 2 in the Appendix.

  7. Memory Profiler can be accessed at https://github.com/pythonprofilers/memory_profiler.

  8. Formulas and calculations are provided in the Appendix.
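
The detection rate (DR) and false positive rate (FPR) given in note 3 can be computed directly from confusion-matrix counts. The following is a minimal sketch of that arithmetic only, not the evaluation code used in the paper: the names `ConfusionCounts`, `confusion_counts`, `detection_rate` and `false_positive_rate` are hypothetical, the label convention (1 = botnet, 0 = benign) is an assumption, and the example labels are made up for illustration rather than taken from the reported results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper type)."""
    tp: int  # botnet flows predicted as botnet
    fn: int  # botnet flows predicted as benign
    fp: int  # benign flows predicted as botnet
    tn: int  # benign flows predicted as benign


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Count TP, FN, FP and TN, assuming 1 = botnet and 0 = benign."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp=tp, fn=fn, fp=fp, tn=tn)


def detection_rate(c: ConfusionCounts) -> float:
    """DR = TP / (TP + FN), identical to recall; 0.0 if there are no positives."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else 0.0


def false_positive_rate(c: ConfusionCounts) -> float:
    """FPR = FP / (FP + TN); 0.0 if there are no negatives."""
    negatives = c.fp + c.tn
    return c.fp / negatives if negatives else 0.0


if __name__ == "__main__":
    # Made-up labels for illustration only.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
    counts = confusion_counts(y_true, y_pred)
    print(f"DR  = {detection_rate(counts):.2f}")
    print(f"FPR = {false_positive_rate(counts):.2f}")
```

Returning 0.0 when a denominator is empty is a design choice made for this sketch; an evaluation pipeline might prefer to raise an error or report the metric as undefined.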

Author information

Corresponding author

Correspondence to Lucas Carr.

A Appendix

For comprehensive information, see the Project Website or GitHub.

A.1 Feature Sets

Table 2. All the features present in flow extraction from CICFlowMeter. Bold and underlined features are included in the \(30\%\) and \(50\%\) feature spaces, respectively. All features are present in the \(100\%\) feature space.

A.2 Classification Results

(See Tables 3, 4, 5 and 6)

Table 3. Results of Binary Classifiers on Default Test Set
Table 4. Results of Binary Classifiers on Proto Zero-Day Test Set
Table 5. Computational Performance of Binary Classifiers
Table 6. Classification and Computational Performance of Multiclass Classifiers

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Carr, L., Chavula, J. (2024). Deep Learning Classification for Encrypted Botnet Traffic: Optimising Model Performance and Resource Utilisation. In: Gerber, A. (eds) South African Computer Science and Information Systems Research Trends. SAICSIT 2024. Communications in Computer and Information Science, vol 2159. Springer, Cham. https://doi.org/10.1007/978-3-031-64881-6_1

  • DOI: https://doi.org/10.1007/978-3-031-64881-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-64880-9

  • Online ISBN: 978-3-031-64881-6

  • eBook Packages: Computer Science, Computer Science (R0)