A New Method for Inferring Ground-Truth Labels and Malware Detector Effectiveness Metrics

Charlton, John; Du, Pang; Xu, Shouhuai

doi:10.1007/978-3-030-89137-4_6


Part of the book series: Lecture Notes in Computer Science (LNSC, volume 13005)

Included in the following conference series:

International Conference on Science of Cyber Security

Abstract

In the context of malware detection, ground-truth labels of files are often difficult or costly to obtain; as a consequence, malware detector effectiveness metrics (e.g., false-positive and false-negative rates) are hard to measure. The unavailability of ground-truth labels also hinders the training of machine-learning-based malware detectors. Researchers and practitioners often encounter these issues and are forced to use various heuristics without justification. Seeking principled methods has therefore become an important open problem. In this paper, we present a principled method for tackling the problem.

Acknowledgement

We thank the reviewers for their useful comments. This work was supported in part by NSF Grant #2122631 (#1814825) and by a grant from the State of Colorado.

Author information

Authors and Affiliations

Department of Computer Science, University of Texas at San Antonio, San Antonio, USA
John Charlton
Department of Statistics, Virginia Tech, Blacksburg, USA
Pang Du
Department of Computer Science, University of Colorado Colorado Springs, Colorado Springs, USA
Shouhuai Xu

Corresponding author

Correspondence to Shouhuai Xu.

Editor information

Editors and Affiliations

Fudan University, Shanghai, China
Wenlian Lu
George Mason University, Fairfax, VA, USA
Kun Sun
Computer Science Department, Columbia University, New York, NY, USA
Moti Yung
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Feng Liu

About this paper

Cite this paper

Charlton, J., Du, P., Xu, S. (2021). A New Method for Inferring Ground-Truth Labels and Malware Detector Effectiveness Metrics. In: Lu, W., Sun, K., Yung, M., Liu, F. (eds) Science of Cyber Security. SciSec 2021. Lecture Notes in Computer Science, vol 13005. Springer, Cham. https://doi.org/10.1007/978-3-030-89137-4_6

DOI: https://doi.org/10.1007/978-3-030-89137-4_6
Published: 10 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89136-7
Online ISBN: 978-3-030-89137-4
eBook Packages: Computer Science, Computer Science (R0)
