Abstract
The number and importance of AI-based systems are growing in all domains. With their pervasive use and the increasing dependence on them, the quality of these systems becomes essential for their practical usage. However, quality assurance for AI-based systems is an emerging area that has not been well explored and requires collaboration between the software engineering (SE) and AI research communities. To set a baseline for that collaboration, this paper discusses terminology and challenges of quality assurance for AI-based systems. To this end, we define basic concepts and characterize AI-based systems along the three dimensions of artifact type, process, and quality characteristics. Furthermore, we elaborate on the key challenges of (1) understandability and interpretability of AI models, (2) lack of specifications and defined requirements, (3) need for validation data and test input generation, (4) defining expected outcomes as test oracles, (5) accuracy and correctness measures, (6) non-functional properties of AI-based systems, (7) self-adaptive and self-learning characteristics, and (8) dynamic and frequently changing environments.
Acknowledgements
The research reported in this paper has been partly funded by the Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK), the Federal Ministry for Digital and Economic Affairs (BMDW), and the Province of Upper Austria within the COMET (Competence Centers for Excellent Technologies) Programme, managed by the Austrian Research Promotion Agency (FFG).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Felderer, M., Ramler, R. (2021). Quality Assurance for AI-Based Systems: Overview and Challenges (Introduction to Interactive Session). In: Winkler, D., Biffl, S., Mendez, D., Wimmer, M., Bergsmann, J. (eds) Software Quality: Future Perspectives on Software Engineering Quality. SWQD 2021. Lecture Notes in Business Information Processing, vol 404. Springer, Cham. https://doi.org/10.1007/978-3-030-65854-0_3
DOI: https://doi.org/10.1007/978-3-030-65854-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65853-3
Online ISBN: 978-3-030-65854-0
eBook Packages: Computer Science, Computer Science (R0)