Chauffeur model. https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models/chauffeur (2016)
Epoch model. https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models/cg23 (2016)
Amershi, S., Begel, A., Bird, C., DeLine, R., Gall, H., Kamar, E., Nagappan, N., Nushi, B., Zimmermann, T.: Software engineering for machine learning: A case study. In: Proceedings of the 41st International Conference on Software Engineering: Software Engineering in Practice. pp. 291–300. ICSE-SEIP ’19, IEEE Press (2019). https://doi.org/10.1109/ICSE-SEIP.2019.00042
Balunovic, M., Baader, M., Singh, G., Gehr, T., Vechev, M.: Certifying geometric robustness of neural networks. In: Advances in Neural Information Processing Systems. pp. 15287–15297 (2019)
Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J., et al.: End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316 (2016)
Bojarski, M., Testa, D.D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J., Zhang, X., Zhao, J., Zieba, K.: End to end learning for self-driving cars. CoRR abs/1604.07316 (2016), http://arxiv.org/abs/1604.07316
Bunel, R., Turkaslan, I., Torr, P.H., Kohli, P., Kumar, M.P.: A unified view of piecewise linear neural network verification. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. pp. 4795–4804. NIPS’18, Curran Associates Inc., Red Hook, NY, USA (2018)
Byun, T., Sharma, V., Vijayakumar, A., Rayadurgam, S., Cofer, D.: Input prioritization for testing neural networks (01 2019)
Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: Security and Privacy (SP), 2017 IEEE Symposium on. pp. 39–57. IEEE (2017)
Cohen, J.: Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates (1988)
Du, X., Xie, X., Li, Y., Ma, L., Liu, Y., Zhao, J.: Deepstellar: Model-based quantitative analysis of stateful deep learning systems. In: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. pp. 477–487. ESEC/FSE 2019, Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3338906.3338954
Ehlers, R.: Formal verification of piece-wise linear feed-forward neural networks. In: International Symposium on Automated Technology for Verification and Analysis. pp. 269–286. Springer (2017)
Engstrom, L., Tran, B., Tsipras, D., Schmidt, L., Madry, A.: A rotation and a translation suffice: Fooling cnns with simple transformations. arXiv preprint arXiv:1712.02779 (2017)
Engstrom, L., Tran, B., Tsipras, D., Schmidt, L., Madry, A.: Exploring the landscape of spatial robustness. In: International Conference on Machine Learning. pp. 1802–1811 (2019)
Engstrom, L., Tran, B., Tsipras, D., Schmidt, L., Mądry, A.: A rotation and a translation suffice: Fooling cnns with simple transformations. In: Proceedings of the 36th international conference on machine learning (ICML) (2019)
Eniser, H.F., Gerasimou, S., Sen, A.: Deepfault: Fault localization for deep neural networks. In: Hähnle, R., van der Aalst, W. (eds.) Fundamental Approaches to Software Engineering. pp. 171–191. Springer International Publishing, Cham (2019)
Feinman, R., Curtin, R.R., Shintre, S., Gardner, A.B.: Detecting adversarial samples from artifacts. arXiv preprint arXiv:1703.00410 (2017)
Gal, Y.: Uncertainty in Deep Learning (2016)
Gal, Y., Ghahramani, Z.: Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In: Balcan, M.F., Weinberger, K.Q. (eds.) Proceedings of The 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 1050–1059. PMLR, New York, New York, USA (20–22 Jun 2016), http://proceedings.mlr.press/v48/gal16.html
Gao, X., Saha, R., Prasad, M., Roychoudhury, A.: Fuzz testing based data augmentation to improve robustness of deep neural networks. In: Proceedings of the 42nd International Conference on Software Engineering. ICSE 2020, ACM (2020)
Gerasimou, S., Eniser, H.F., Sen, A., Çakan, A.: Importance-driven deep learning system testing. In: International Conference of Software Engineering (ICSE) (2020)
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Advances in neural information processing systems. pp. 2672–2680 (2014)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: International Conference on Learning Representations (ICLR) (2015)
Gross, D., Jansen, N., Pérez, G.A., Raaijmakers, S.: Robustness verification for classifier ensembles. In: Hung, D.V., Sokolsky, O. (eds.) Automated Technology for Verification and Analysis. pp. 271–287. Springer International Publishing, Cham (2020)
Gu, S., Rigazio, L.: Towards deep neural network architectures robust to adversarial examples. In: International Conference on Learning Representations (ICLR) (2015)
Guo, C., Gardner, J., You, Y., Wilson, A.G., Weinberger, K.: Simple black-box adversarial attacks. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 2484–2493. PMLR, Long Beach, California, USA (09–15 Jun 2019), http://proceedings.mlr.press/v97/guo19a.html
Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q.: On calibration of modern neural networks. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 70, pp. 1321–1330. PMLR, International Convention Centre, Sydney, Australia (06–11 Aug 2017), http://proceedings.mlr.press/v70/guo17a.html
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 770–778 (2016)
He, P., Meister, C., Su, Z.: Structure-invariant testing for machine translation. In: International Conference of Software Engineering (ICSE) (2020)
Huang, X., Kwiatkowska, M., Wang, S., Wu, M.: Safety verification of deep neural networks. In: International Conference on Computer Aided Verification. pp. 3–29. Springer (2017)
Ilyas, A., Santurkar, S., Tsipras, D., Engstrom, L., Tran, B., Madry, A.: Adversarial examples are not bugs, they are features (2019), http://arxiv.org/abs/1905.02175
Islam, M.J., Nguyen, G., Pan, R., Rajan, H.: A comprehensive study on deep learning bug characteristics. In: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. pp. 510–520. ESEC/FSE 2019, Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3338906.3338955
Jha, S., Raj, S., Fernandes, S., Jha, S.K., Jha, S., Jalaian, B., Verma, G., Swami, A.: Attribution-based confidence metric for deep neural networks. In: Advances in Neural Information Processing Systems. pp. 11826–11837 (2019)
Jiang, H., Kim, B., Gupta, M.: To trust or not to trust a classifier. In: Advances in Neural Information Processing Systems. pp. 5541–5552 (2018)
Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks, pp. 97–117. Springer International Publishing, Cham (2017)
Kim, J., Feldt, R., Yoo, S.: Guiding deep learning system testing using surprise adequacy. In: Proceedings of the 41st International Conference on Software Engineering. pp. 1039–1049. IEEE Press (2019)
Krizhevsky, A.: Learning multiple layers of features from tiny images. University of Toronto (05 2012)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems. pp. 1097–1105 (2012)
Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533 (2016)
Li, Z., Ma, X., Xu, C., Cao, C., Xu, J., Lü, J.: Boosting operational dnn testing efficiency through conditioning. In: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. pp. 499–509. ESEC/FSE 2019, Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3338906.3338930
Ma, L., Juefei-Xu, F., Sun, J., Chen, C., Su, T., Zhang, F., Xue, M., Li, B., Li, L., Liu, Y., et al.: Deepgauge: Comprehensive and multi-granularity testing criteria for gauging the robustness of deep learning systems. arXiv preprint arXiv:1803.07519 (2018)
Ma, S., Liu, Y., Lee, W.C., Zhang, X., Grama, A.: Mode: automated neural network model debugging via state differential analysis and input selection. In: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. pp. 175–186. ACM (2018)
Ma, X., Li, B., Wang, Y., Erfani, S.M., Wijewickrema, S., Schoenebeck, G., Song, D., Houle, M.E., Bailey, J.: Characterizing adversarial subspaces using local intrinsic dimensionality. In: International Conference on Learning Representations (ICLR) (2018)
van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008), http://www.jmlr.org/papers/v9/vandermaaten08a.html
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: International Conference on Learning Representations (ICLR) (2018)
Mann, H.B., Whitney, D.R.: On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics 18(1), 50–60 (1947)
Mao, C., Zhong, Z., Yang, J., Vondrick, C., Ray, B.: Metric learning for adversarial robustness. In: Advances in Neural Information Processing Systems. pp. 478–489 (2019)
Metzen, J.H., Genewein, T., Fischer, V., Bischoff, B.: On detecting adversarial perturbations. In: International Conference on Learning Representations (ICLR) (2017)
Mirman, M., Gehr, T., Vechev, M.: Differentiable abstract interpretation for provably robust neural networks. In: International Conference on Machine Learning. pp. 3575–3583 (2018)
Moon, S., An, G., Song, H.O.: Parsimonious black-box adversarial attacks via efficient combinatorial optimization. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 4636–4645. PMLR, Long Beach, California, USA (09–15 Jun 2019), http://proceedings.mlr.press/v97/moon19a.html
Ozdag, M., Raj, S., Fernandes, S., Velasquez, A., Pullum, L., Jha, S.K.: On the susceptibility of deep neural networks to natural perturbations. In: AISafety@IJCAI (2019)
Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A.: The limitations of deep learning in adversarial settings. In: 2016 IEEE European Symposium on Security and Privacy (EuroS&P). pp. 372–387. IEEE (2016)
Papernot, N., McDaniel, P., Wu, X., Jha, S., Swami, A.: Distillation as a defense to adversarial perturbations against deep neural networks. In: Security and Privacy (SP), 2016 IEEE Symposium on. pp. 582–597. IEEE (2016)
Pei, K., Cao, Y., Yang, J., Jana, S.: Deepxplore: Automated whitebox testing of deep learning systems. In: Proceedings of the 26th Symposium on Operating Systems Principles. pp. 1–18. ACM (2017)
Pei, K., Cao, Y., Yang, J., Jana, S.: Towards practical verification of machine learning: The case of computer vision systems. arXiv preprint arXiv:1712.01785 (2017)
Pham, H.V., Lutellier, T., Qi, W., Tan, L.: Cradle: Cross-backend validation to detect and localize bugs in deep learning libraries. In: Proceedings of the 41st International Conference on Software Engineering. pp. 1027–1038. ICSE ’19, IEEE Press (2019). https://doi.org/10.1109/ICSE.2019.00107
Qiu, X., Meyerson, E., Miikkulainen, R.: Quantifying point-prediction uncertainty in neural networks via residual estimation with an i/o kernel. In: International Conference on Learning Representations (2020), https://openreview.net/forum?id=rkxNh1Stvr
Sawilowsky, S.: New effect size rules of thumb. Journal of Modern Applied Statistical Methods 8, 597–599 (11 2009). https://doi.org/10.22237/jmasm/1257035100
Saxena, U.: Automold. https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library/
Sen, K., Marinov, D., Agha, G.: CUTE: A concolic unit testing engine for C. In: FSE (2005)
Seshia, S.A., Desai, A., Dreossi, T., Fremont, D.J., Ghosh, S., Kim, E., Shivakumar, S., Vazquez-Chanlatte, M., Yue, X.: Formal specification for deep neural networks. In: International Symposium on Automated Technology for Verification and Analysis. pp. 20–34. Springer (2018)
Shaham, U., Yamada, Y., Negahban, S.: Understanding adversarial training: Increasing local stability of neural nets through robust optimization. arXiv preprint arXiv:1511.05432 (2015)
Shankar, V., Dave, A., Roelofs, R., Ramanan, D., Recht, B., Schmidt, L.: A systematic framework for natural perturbations from videos (06 2019)
Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., Hassabis, D.: Mastering the game of go with deep neural networks and tree search. Nature 529, 484–503 (2016), http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (ICLR) (2015)
Simpson, E.H.: Measurement of diversity. Nature 163(4148), 688–688 (1949), https://doi.org/10.1038/163688a0
Stocco, A., Weiss, M., Calzana, M., Tonella, P.: Misbehaviour prediction for autonomous driving systems. In: Proceedings of the 42nd International Conference on Software Engineering. 12 pages. ICSE ’20, ACM (2020)
Stocco, A., Weiss, M., Calzana, M., Tonella, P.: Misbehaviour prediction for autonomous driving systems. In: International Conference of Software Engineering (ICSE) (2020)
Sun, Y., Wu, M., Ruan, W., Huang, X., Kwiatkowska, M., Kroening, D.: Concolic testing for deep neural networks (2018)
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. In: International Conference on Learning Representations (ICLR) (2014)
Teye, M., Azizpour, H., Smith, K.: Bayesian uncertainty estimation for batch normalized deep networks. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 4907–4916. PMLR, Stockholmsmässan, Stockholm Sweden (10–15 Jul 2018), http://proceedings.mlr.press/v80/teye18a.html
Tian, Y., Pei, K., Jana, S., Ray, B.: Deeptest: Automated testing of deep-neural-network-driven autonomous cars. In: International Conference of Software Engineering (ICSE), 2018 IEEE conference on. IEEE (2018)
Tian, Y., Zhong, Z., Ordonez, V., Kaiser, G., Ray, B.: Testing dnn image classifier for confusion & bias errors. In: International Conference of Software Engineering (ICSE) (2020)
Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., McDaniel, P.: Ensemble adversarial training: Attacks and defenses. arXiv preprint arXiv:1705.07204 (2017)
Tsipras, D., Santurkar, S., Engstrom, L., Turner, A., Madry, A.: Robustness may be at odds with accuracy. In: International Conference on Learning Representations (ICLR) (2019)
Udacity: A self-driving car simulator built with Unity. https://github.com/udacity/self-driving-car-sim (2017), online; accessed 18 August 2019
Udeshi, S., Jiang, X., Chattopadhyay, S.: Callisto: Entropy-based test generation and data quality assessment for machine learning systems. In: 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST). pp. 448–453 (2020)
Wang, F., Jiang, M., Qian, C., Yang, S., Li, C., Zhang, H., Wang, X., Tang, X.: Residual attention network for image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3156–3164 (2017)
Wang, J., Dong, G., Sun, J., Wang, X., Zhang, P.: Adversarial sample detection for deep neural network through model mutation testing. In: Proceedings of the 41st International Conference on Software Engineering. pp. 1245–1256. ICSE ’19, IEEE Press (2019). https://doi.org/10.1109/ICSE.2019.00126
Wang, S., Chen, Y., Abdou, A., Jana, S.: Mixtrain: Scalable training of formally robust neural networks. arXiv preprint arXiv:1811.02625 (2018)
Wang, S., Pei, K., Whitehouse, J., Yang, J., Jana, S.: Efficient formal safety analysis of neural networks. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. pp. 6369–6379. NIPS’18, Curran Associates Inc., USA (2018), http://dl.acm.org/citation.cfm?id=3327345.3327533
Wang, S., Pei, K., Whitehouse, J., Yang, J., Jana, S.: Formal security analysis of neural networks using symbolic intervals. USENIX Security Symposium (2018)
Wong, E., Schmidt, F., Metzen, J.H., Kolter, J.Z.: Scaling provable adversarial defenses. In: Advances in Neural Information Processing Systems. pp. 8400–8409 (2018)
Xiao, C., Li, B., Zhu, J.Y., He, W., Liu, M., Song, D.: Generating adversarial examples with adversarial networks. In: 27th International Joint Conference on Artificial Intelligence (IJCAI) (2018)
Xiao, C., Zhu, J.Y., Li, B., He, W., Liu, M., Song, D.: Spatially transformed adversarial examples. In: International Conference on Learning Representations (ICLR) (2018)
Xiao, H., Rasul, K., Vollgraf, R.: Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms (2017)
Yang, F., Wang, Z., Heinze-Deml, C.: Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness. In: Advances in Neural Information Processing Systems 32. pp. 14757–14768 (2019)
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning (2011)
Zagoruyko, S., Komodakis, N.: Wide residual networks. In: BMVC (2016)
Zhang, H., Chan, W.K.: Apricot: A weight-adaptation approach to fixing deep learning models. In: 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). pp. 376–387 (Nov 2019). https://doi.org/10.1109/ASE.2019.00043
Zhang, M., Zhang, Y., Zhang, L., Liu, C., Khurshid, S.: Deeproad: Gan-based metamorphic autonomous driving system testing. arXiv preprint arXiv:1802.02295 (2018)
Zhao, Z., Dua, D., Singh, S.: Generating natural adversarial examples. In: International Conference on Learning Representations (ICLR) (2018)
Zhou, H., Li, W., Kong, Z., Guo, J., Zhang, Y., Zhang, L., Yu, B., Liu, C.: Deepbillboard: Systematic physical-world testing of autonomous driving systems. In: International Conference of Software Engineering (ICSE) (2020)