Abstract
Deep neural networks (DNNs) are increasingly used for decision-making in safety-critical systems (e.g., medical applications, autonomous vehicles). Early work on adversarial examples showed that they can hinder a DNN's ability to correctly process inputs. These techniques largely focused on identifying individual causes of failure, emphasizing that minimal perturbations can cause a DNN to fail. Recent research suggests that diverse adversarial examples, those that cause different erroneous model behaviors, can better assess the robustness of DNNs. These techniques, however, use white-box approaches to exhaustively search for diverse DNN misbehaviors. This paper proposes a black-box, model- and data-agnostic approach to generate diverse sets of adversarial examples, where each set corresponds to one type of model misbehavior (e.g., misclassification of an image to a specific label), termed a failure category. Each failure category, in turn, comprises a diverse set of perturbations, all of which produce the same type of model misbehavior. As such, this work provides both breadth- and depth-based information for addressing robustness with respect to multiple categories of failure caused by adversarial examples. We illustrate our approach by applying it to popular image classification datasets using different image classification models.
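The sketch below illustrates the kind of two-phase, black-box search the abstract describes: a breadth phase that discovers failure categories (distinct erroneous labels) and a depth phase that evolves a diverse set of perturbations within one category. It is a minimal illustration only; the model wrapper, fitness terms, and hyperparameters are assumptions for exposition, not the paper's actual Expound algorithm.

```python
# Minimal sketch of a two-phase, black-box, diversity-driven adversarial
# search. The toy classifier, fitness terms, and hyperparameters are
# illustrative assumptions, not the paper's actual (Expound) algorithm.
import numpy as np

rng = np.random.default_rng(0)

def predict(image):
    """Stand-in for a black-box classifier: returns a predicted label.
    In practice this would query the DNN under test via its API."""
    return int(image.sum() * 100) % 10  # hypothetical toy model

def perturb(image, epsilon=0.05):
    """Apply a small random perturbation, clipped to valid pixel range."""
    noise = rng.uniform(-epsilon, epsilon, size=image.shape)
    return np.clip(image + noise, 0.0, 1.0)

def find_failure_categories(image, true_label, queries=500):
    """Phase 1 (breadth): sample perturbations and group the resulting
    misclassifications by the erroneous label they produce."""
    categories = {}
    for _ in range(queries):
        candidate = perturb(image)
        label = predict(candidate)
        if label != true_label:
            categories.setdefault(label, []).append(candidate)
    return categories

def diversify_category(image, true_label, target_label, pop=20, gens=30):
    """Phase 2 (depth): evolve a population of perturbations that all
    trigger the same misclassification while staying mutually distant."""
    population = [perturb(image) for _ in range(pop)]
    for _ in range(gens):
        scored = []
        for cand in population:
            hits_category = predict(cand) == target_label
            # Novelty term: mean distance to the rest of the population.
            novelty = np.mean([np.linalg.norm(cand - o) for o in population])
            scored.append((hits_category, novelty, cand))
        # Prefer candidates in the target category, then the most novel ones.
        scored.sort(key=lambda t: (t[0], t[1]), reverse=True)
        survivors = [c for _, _, c in scored[: pop // 2]]
        population = survivors + [perturb(s) for s in survivors]
    return [c for c in population if predict(c) == target_label]

image = rng.uniform(0.0, 1.0, size=(8, 8))
truth = predict(image)
for label, seeds in find_failure_categories(image, truth).items():
    diverse = diversify_category(image, truth, label)
    print(f"category {label}: {len(diverse)} diverse adversarial examples")
```

Under these assumptions, phase 1 yields the breadth-based view (which misbehaviors are reachable) and phase 2 the depth-based view (how varied the perturbations within one misbehavior can be).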
Notes
1. This work uses the term diversity to describe different DNN model behaviors.
2. One definition of expound is to explain systematically or in detail.
3. Kobalos is a mischievous sprite in Greek mythology.
4. Tarand is a legendary creature with chameleon-like properties in Greek mythology.
Acknowledgements
We greatly appreciate Michael Austin Langford's contributions to our preliminary work, as well as the insightful and detailed feedback from the reviewers. This work was supported in part by funding provided by Michigan State University, the BEACON Center, the Air Force Research Laboratory, and our industrial collaborators.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chan, K.H., Cheng, B.H.C. (2024). Expound: A Black-Box Approach for Generating Diversity-Driven Adversarial Examples. In: Arcaini, P., Yue, T., Fredericks, E.M. (eds) Search-Based Software Engineering. SSBSE 2023. Lecture Notes in Computer Science, vol 14415. Springer, Cham. https://doi.org/10.1007/978-3-031-48796-5_2
DOI: https://doi.org/10.1007/978-3-031-48796-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48795-8
Online ISBN: 978-3-031-48796-5