DNS exfiltration detection in the presence of adversarial attacks and modified exfiltrator behaviour

Žiža, Kristijan; Tadić, Predrag; Vuletić, Pavle

doi:10.1007/s10207-023-00723-w

Regular Contribution
Published: 08 July 2023

Volume 22, pages 1865–1880 (2023)
International Journal of Information Security

Kristijan Žiža¹,
Predrag Tadić¹ &
Pavle Vuletić¹

Abstract

Domain Name System (DNS) exfiltration is an activity in which an infected device sends data to the attacker's server by encoding it in DNS request messages. Because DNS exfiltration is frequently used for malicious purposes, its detection has gained attention from the research community, which has proposed several predominantly machine learning-based methods. The majority of previous studies used publicly available DNS exfiltration tools with their default configuration parameters. The resulting datasets consist of DNS exfiltration requests that are usually significantly longer, have more DNS name labels, and have higher character entropy than average regular DNS requests, which in turn led to overly optimistic detection rates. In this paper, we explore some of the strategies an attacker could use to avoid exfiltration detection. First, we explore the impact of varying the parameters of DNS exfiltration tools on exfiltration detection accuracy. Second, we modify the DNSExfiltrator tool to produce exfiltration requests with significantly lower character entropy. This approach proved capable of deceiving classifiers based on single DNS request features: only around 1% of modified DNS requests of 9 bytes or fewer, and less than one third of DNS exfiltration requests in the overall population, were accurately detected. In addition, we present a methodology and an aggregated feature set (including inter-request timing statistics) which can be used for accurate DNS exfiltration detection in this kind of adversarial setting.
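To make the features named in the abstract concrete, the following is a minimal sketch of how request length, label count and character entropy could be computed for one DNS query name, together with simple inter-request timing statistics for a group of requests. The function names, the exact feature definitions and the choice of mean and standard deviation are assumptions made for illustration; the abstract does not specify how the paper defines them.

```python
import math
import statistics
from collections import Counter


def char_entropy(text: str) -> float:
    """Shannon entropy, in bits per character, of the characters in text."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def single_request_features(qname: str) -> dict:
    """Illustrative per-request features for one DNS query name."""
    # Drop the trailing root dot and normalise case (DNS names are case-insensitive).
    name = qname.rstrip(".").lower()
    labels = name.split(".") if name else []
    return {
        "length": len(name),
        "num_labels": len(labels),
        # Entropy is computed over the label characters only, without the dots.
        "entropy": char_entropy(name.replace(".", "")),
    }


def timing_stats(timestamps: list) -> dict:
    """Mean and population standard deviation of gaps between consecutive requests."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return {"mean_gap": 0.0, "std_gap": 0.0}
    return {"mean_gap": statistics.mean(gaps), "std_gap": statistics.pstdev(gaps)}
```

A name built from a small, repetitive alphabet yields a low value of `char_entropy`, which is the kind of property a modified exfiltrator can exploit against classifiers that look at one request at a time. Statistics such as those in `timing_stats` depend on a whole group of requests and so cannot be changed by editing a single query name.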


Availability of data and material

Data is publicly available at [31].

Code availability

Code is publicly available at [30, 32].

References

1. New FrameworkPOS variant exfiltrates data via DNS requests (2014), G Data blog, https://www.gdatasoftware.com/blog/2014/10/23942-new-frameworkpos-variant-exfiltrates-data-via-dns-requests, Accessed on March 6 2023
2. Krebs, B.: Deconstructing the 2014 Sally Beauty Breach (2015), Krebs on Security, https://krebsonsecurity.com/2015/05/deconstructing-the-2014-sally-beauty-breach/, Accessed on March 6th 2023
3. Netlab blog, New Threat: B1txor20, A Linux Backdoor Using DNS Tunnel, https://blog.netlab.360.com/b1txor20-use-of-dns-tunneling_en/, Accessed on March 16th (2023)
4. Marinho, R.: Translating Saitama's DNS tunneling messages, SANS Infosec handlers diary, https://isc.sans.edu/diary/Translating+Saitama%27s+DNS+tunneling+messages/28738, Accessed on March 16th (2023)
5. Yunakovsky, S., Pomerantsev, I.: Denis and Co, Securelist by Kaspersky, https://securelist.com/denis-and-company/83671/, 2018, Accessed on March 6 (2023)
6. Tuna, O.F., Catak, F.O., Eskil, M.T.: TENET: a new hybrid network architecture for adversarial defense. Int. J. Inf. Secur. (2023). https://doi.org/10.1007/s10207-023-00675-1
7. Sabir, B., Ullah, F., Babar, M.A., Gaire, R.: Machine learning for detecting data exfiltration. ACM Comput. Surv. 54(3), 1–47 (2021). https://doi.org/10.1145/3442181
8. Wang, Y., Zhou, A., Liao, S., Zheng, R., Hu, R., Zhang, L.: A comprehensive survey on DNS tunnel detection. Comput. Netw. 197, 108322 (2021). https://doi.org/10.1016/j.comnet.2021.108322
9. Ishikura, N., Kondo, D., Vassiliades, V., Iordanov, I., Tode, H.: DNS tunneling detection by cache-property-aware features. IEEE Trans. Netw. Service Manag. 18(2), 1203–1217 (2021). https://doi.org/10.1109/TNSM.2021.3078428
10. Zhan, M., Li, Y., Yu, G., Li, B., Wang, W.: Detecting DNS over HTTPS based data exfiltration. Comput. Netw. 209, 108919 (2022). https://doi.org/10.1016/j.comnet.2022.108919
11. Ahmed, J., Gharakheili, H.H., Raza, Q., Russell, C., Sivaraman, V.: Real-time detection of DNS exfiltration and tunneling from enterprise networks. IFIP/IEEE Sympos. Integrat. Netw. Service Manag. (IM) 2019, 649–653 (2019)
12. Tatang, D., Quinkert, F., Holz, T.: Below the radar: spotting DNS tunnels in newly observed hostnames in the wild. APWG Sympos. Electron. Crime Res. (ECrime) 2019, 1–15 (2019). https://doi.org/10.1109/eCrime47957.2019.9037595
13. CIC-Bell-DNS-EXF-2021 Dataset, A collaborative project with Bell Canada (BC) Cyber Threat Intelligence (CTI), https://www.unb.ca/cic/datasets/dns-exf-2021.html, Accessed on October 22 (2022)
14. Wang, S., Sun, L., Qin, S., Li, W., Liu, W.: KRTunnel: DNS channel detector for mobile devices. Comput. Secur. 120, 102818 (2022). https://doi.org/10.1016/j.cose.2022.102818
15. Liu, J., Li, S., Zhang, Y., Xiao, J., Chang, P., Peng, C.: Detecting DNS tunnel through binary-classification based on behavior features. Proceedings - 16th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, 11th IEEE International Conference on Big Data Science and Engineering and 14th IEEE International Conference on Embedded Software and Systems, 339–346 (2017). https://doi.org/10.1109/Trustcom/BigDataSE/ICESS.2017.256
16. Bai, H., Liu, W., Liu, G., Dai, Y., Huang, S.: Application behavior identification in DNS tunnels based on spatial-temporal information. IEEE Access 9, 80639–80653 (2021). https://doi.org/10.1109/ACCESS.2021.3085500
17. Xu, K., Butler, P., Saha, S., Yao, D.: DNS for massive-scale command and control. IEEE Trans. Dependable Secure Comput. 10(3), 143–153 (2013). https://doi.org/10.1109/TDSC.2013.10
18. Jovanović, Ɖ., Vuletić, P.: Analysis and characterization of IoT malware command and control communication. Telfor Journal 12(2), 80–85 (2020). https://doi.org/10.5937/telfor2002080J
19. Paxson, V., Christodorescu, M., Javed, M., Rao, J., Sailer, R., Schales, D.L., Stoecklin, M., Thomas, K., Venema, W., Weaver, N.: Practical Comprehensive Bounds on Surreptitious Communication over DNS. 22nd USENIX Security Symposium (USENIX Security 13), 17–32 (2013). https://www.usenix.org/conference/usenixsecurity13/technical-sessions/presentation/paxson
20. Almusawi, A., Amintoosi, H.: DNS tunneling detection method based on multilabel support vector machine. Security and Commun. Netw. 2018, 1–9 (2018). https://doi.org/10.1155/2018/6137098
21. Nadler, A., Aminov, A., Shabtai, A.: Detection of malicious and low throughput data exfiltration over the DNS protocol. Comput. Secur. 80, 36–53 (2019). https://doi.org/10.1016/j.cose.2018.09.006
22. Aiello, M., Mongelli, M., Papaleo, G.: Basic classifiers for DNS tunneling detection. Proceedings - International Symposium on Computers and Communications, 880–885 (2013). https://doi.org/10.1109/ISCC.2013.6755060
23. Chen, S., Lang, B., Liu, H., Li, D., Gao, C.: DNS covert channel detection method using the LSTM model. Comput. Secur. 104, 102095 (2021). https://doi.org/10.1016/j.cose.2020.102095
24. Homem, I., Papapetrou, P., Dosis, S.: Information-Entropy-Based DNS Tunnel Prediction, pp. 127–140 (2018). https://doi.org/10.1007/978-3-319-99277-8_8
25. Steadman, J., Scott-Hayward, S.: DNSxD: Detecting Data Exfiltration over DNS. 2018 IEEE Conference on Network Function Virtualization and Software Defined Networks, NFV-SDN 2018, 2013, 1–6 (2018). https://doi.org/10.1109/NFV-SDN.2018.8725640
26. Shafieian, S., Smith, D., Zulkernine, M.: Detecting DNS Tunneling Using Ensemble Learning, pp. 112–127 (2017). https://doi.org/10.1007/978-3-319-64701-2_9
27. D'Angelo, G., Castiglione, A., Palmieri, F.: DNS tunnels detection via DNS-images. Inf. Process. Manage. 59(3), 102930 (2022). https://doi.org/10.1016/j.ipm.2022.102930
28. Steadman, J., Scott-Hayward, S.: DNSxP: Enhancing data exfiltration protection through data plane programmability. Comput. Netw. 195, 108174 (2021). https://doi.org/10.1016/j.comnet.2021.108174
29. Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., Hoffman, P.: Specification for DNS over Transport Layer Security (TLS), IETF RFC 7858, ISSN: 2070-1721
30. https://github.com/kristijanziza/dns, Accessed on March 20th (2023)
31. Ziza, K., Vuletić, P., Tadić, P.: DNS Exfiltration Dataset, Mendeley Data, v2 (2022). https://doi.org/10.17632/c4n7fckkz3.2
32. DNS Exfiltration classifiers, https://github.com/ptadic/dns-exfiltration, Accessed on March 4th (2023)
33. Sagi, O., Rokach, L.: Ensemble learning: a survey. Wiley Interdisciplin. Rev.: Data Mining and Knowl. Dis. 8(4), e1249 (2018). https://doi.org/10.1002/widm.1249
34. Rincy, T.N., Gupta, R.: Ensemble learning techniques and its efficiency in machine learning: A survey. 2nd International Conference on Data, Engineering and Applications (IDEA), 1–6 (2020). https://doi.org/10.1109/IDEA49133.2020.9170675
35. James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to statistical learning with applications in R, Second Edition. Springer Science+Business Media, LLC. ISBN 978-1-0716-1417-4. https://doi.org/10.1007/978-1-0716-1418-1
36. Fernández-Delgado, M., Cernadas, E., Barro, S., Amorim, D.: Do we need hundreds of classifiers to solve real world classification problems? J. Mach. Learn. Res. 15(1), 3133–3181 (2014)
37. Wainberg, M., Alipanahi, B., Frey, B.J.: Are random forests truly the best classifiers? J. Mach. Learn. Res. 17(1), 3837–3841 (2016)
38. Géron, A.: Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems. O'Reilly Media, Inc. ISBN 978-1-492-03264-9 (2019)
39. Chen, T., Guestrin, C.: Xgboost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 785–794 (2016)
40. https://github.com/dmlc/xgboost/tree/master/demo#machine-learning-challenge-winning-solutions, Accessed on March 28th (2023)
41. Ho, T.K.: Random decision forests. In Proceedings of 3rd international conference on document analysis and recognition, Vol. 1, pp. 278–282. IEEE (1995). https://doi.org/10.1109/icdar.1995.598994
42. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20(8), 832–844 (1998). https://doi.org/10.1109/34.709601
43. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). https://doi.org/10.1023/a:1010933404324
44. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
45. Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd edition). Springer, Berlin (2009)
46. iodine DNS exfiltration tool, https://code.kryo.se/iodine/, Accessed on May 27th (2023)
47. DNSExfiltrator, https://github.com/Arno0x/DNSExfiltrator, Accessed on May 27th (2023)

Funding

This work has been supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia under Grant Agreement No. 451-03-68/2022-14/200103, and under projects TR32038 and III42007.

Author information

Authors and Affiliations

University of Belgrade, School of Electrical Engineering, Bulevar Kralja Aleksandra 73, Belgrade, Serbia
Kristijan Žiža, Predrag Tadić & Pavle Vuletić

Contributions

Kristijan Žiža: Data curation, Investigation, Methodology, Software, Writing. Predrag Tadić: Data curation, Investigation, Validation, Software, Writing - review and editing. Pavle Vuletić: Conceptualisation, Methodology, Investigation, Resources, Writing - review and editing, Validation, Supervision.

Corresponding author

Correspondence to Pavle Vuletić.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare. All co-authors have seen and agreed with the contents of the manuscript. We certify that the submission is original work and is not under review at any other publication.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Because the datasets, tools and full descriptions of the classification methodologies from previous studies are unavailable, it is not possible to compare different detection strategies in a fair and unbiased way. To still provide some comparison with the classification methodologies used in previous research, we have analysed 9 additional classification methods found in that research. Table 7 lists the classification results obtained for 11 machine learning models: logistic regression (LR), support vector machine with the Gaussian kernel (G-SVM), support vector machine with a linear kernel (L-SVM), naïve Bayes (NB), decision tree (DT), random forest (RF), extremely randomised trees (ERT), AdaBoost (AB), histogram-based gradient boosting (HBG), multi-layer perceptron (MLP) and XGBoost (XGB). We report the accuracy and the F1-score averaged over the two classes (exfiltration or legitimate request). We have analysed the classifiers in three cases: (1) original examples (regular requests plus attacks generated by the unmodified exfiltrator), (2) modified DNSExfiltrator examples, and (3) all examples (both previous groups taken together). To speed up training, we used a smaller training set, composed of around 3.5 M data points randomly chosen from the original dataset and 17k modified requests. We used both individual and aggregated features. The test set had the same number of original requests as the training set and 25k modified examples. XGBoost achieved the best metrics in all three cases, although the differences between most models are quite small.
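The two reported metrics can be illustrated with a short, dependency-free sketch. It assumes the "F1-score averaged over the two classes" is the unweighted (macro) average of the per-class F1 scores, with labels encoded as 1 for exfiltration and 0 for legitimate requests; the function names and the label encoding are hypothetical and are not taken from the authors' code.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score obtained when `positive` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # By convention, return 0.0 when the class is neither present nor predicted.
    return 2 * tp / denom if denom else 0.0


def accuracy_and_macro_f1(y_true, y_pred, classes=(0, 1)):
    """Return (accuracy, unweighted mean of the per-class F1 scores)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    macro_f1 = sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)
    return accuracy, macro_f1


# Toy example: 1 = exfiltration, 0 = legitimate request (assumed encoding).
acc, macro = accuracy_and_macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

Averaging F1 over both classes gives the minority class the same weight as the majority class. This matters when one class is much larger than the other, as with the roughly 3.5 M original data points and 17k modified requests mentioned above, where accuracy alone can look high even if most modified requests are missed.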

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Žiža, K., Tadić, P. & Vuletić, P. DNS exfiltration detection in the presence of adversarial attacks and modified exfiltrator behaviour. Int. J. Inf. Secur. 22, 1865–1880 (2023). https://doi.org/10.1007/s10207-023-00723-w

Published: 08 July 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s10207-023-00723-w
