Quality Assurance for AI-Based Systems: Overview and Challenges (Introduction to Interactive Session)

Felderer, Michael; Ramler, Rudolf

doi:10.1007/978-3-030-65854-0_3

Michael Felderer¹¹ &
Rudolf Ramler¹²

Part of the book series: Lecture Notes in Business Information Processing ((LNBIP,volume 404))

Included in the following conference series:

International Conference on Software Quality

2003 Accesses
23 Citations

Abstract

The number and importance of AI-based systems in all domains is growing. With the pervasive use and the dependence on AI-based systems, the quality of these systems becomes essential for their practical usage. However, quality assurance for AI-based systems is an emerging area that has not been well explored and requires collaboration between the SE and AI research communities. This paper discusses terminology and challenges on quality assurance for AI-based systems to set a baseline for that purpose. Therefore, we define basic concepts and characterize AI-based systems along the three dimensions of artifact type, process, and quality characteristics. Furthermore, we elaborate on the key challenges of (1) understandability and interpretability of AI models, (2) lack of specifications and defined requirements, (3) need for validation data and test input generation, (4) defining expected outcomes as test oracles, (5) accuracy and correctness measures, (6) non-functional properties of AI-based systems, (7) self-adaptive and self-learning characteristics, and (8) dynamic and frequently changing environments.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 54.99; Price excludes VAT (USA)

Softcover Book: USD 69.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Borg, M.: The AIQ meta-testbed: pragmatically bridging academic AI testing and industrial Q needs. In: SWQD 2021. LNBIP, vol. 404, pp. 66–77. Springer, Cham (2021)
Google Scholar
Lenarduzzi, V., Lomio, F., Moreschini, S., Taibi, D., Tamburri, D.A.: Software quality for AI: where we are now? In: SWQD 2021. LNBIP, vol. 404, pp. 43–53. Springer, Cham (2021)
Google Scholar
ISO/IEC: ISO/IEC 25000:2005 software engineering—software product quality requirements and evaluation (square)—guide to square. Technical report, ISO (2011)
Google Scholar
Amershi, S., et al.: Software engineering for machine learning: a case study. In: 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp. 291–300. IEEE (2019)
Google Scholar
Zhang, J.M., Harman, M., Ma, L., Liu, Y.: Machine learning testing: survey, landscapes and horizons. IEEE Trans. Softw. Eng. PP, 1 (2020)
Google Scholar
Felderer, M., Russo, B., Auer, F.: On testing data-intensive software systems. In: Biffl, S., Eckhart, M., Lüder, A., Weippl, E. (eds.) Security and Quality in Cyber-Physical Systems Engineering, pp. 129–148. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25312-7_6
Chapter Google Scholar
ISO/IEC: ISO/IEC 25012:2008 software engineering – software product quality requirements and evaluation (square) – data quality model. Technical report, ISO (2008)
Google Scholar
ISO/IEC: ISO/IEC 25010:2011 systems and software engineering – systems and software quality requirements and evaluation (square) – system and software quality models. Technical report, ISO (2011)
Google Scholar
Ros, R., Runeson, P.: Continuous experimentation and A/B testing: a mapping study. In: 2018 IEEE/ACM 4th International Workshop on Rapid Continuous Software Engineering (RCoSE), pp. 35–41. IEEE (2018)
Google Scholar
Auer, F., Felderer, M.: Current state of research on continuous experimentation: a systematic mapping study. In: 2018 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 335–344. IEEE (2018)
Google Scholar
Yuan, X., He, P., Zhu, Q., Li, X.: Adversarial examples: attacks and defenses for deep learning. IEEE Trans. Neural Netw. Learn. Syst. 30(9), 2805–2824 (2019)
Article MathSciNet Google Scholar
Goebel, R., et al.: Explainable AI: the new 42? In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2018. LNCS, vol. 11015, pp. 295–303. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99740-7_21
Chapter Google Scholar
Bosch, J., Olsson, H.H., Crnkovic, I.: It takes three to tango: requirement, outcome/data, and AI driven development. In: SiBW, pp. 177–192 (2018)
Google Scholar
Fischer, L., et al.: Applying AI in practice: key challenges and lessons learned. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2020. LNCS, vol. 12279, pp. 451–471. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57321-8_25
Chapter Google Scholar
Marijan, D., Gotlieb, A., Ahuja, M.K.: Challenges of testing machine learning based systems. In: 2019 IEEE International Conference On Artificial Intelligence Testing (AITest), pp. 101–102. IEEE (2019)
Google Scholar
ISO/IEC/IEEE international standard - software and systems engineering–software testing–part 4: Test techniques, pp. 1–149. ISO/IEC/IEEE 29119-4:2015 (2015)
Google Scholar
Xie, X., et al.: DeepHunter: a coverage-guided fuzz testing framework for deep neural networks. In: Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2019, pp. 146–157. Association for Computing Machinery, New York (2019)
Google Scholar
Zhang, M., Zhang, Y., Zhang, L., Liu, C., Khurshid, S.: DeepRoad: GAN-based metamorphic testing and input validation framework for autonomous driving systems. In: 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 132–142. IEEE (2018)
Google Scholar
Braiek, H.B., Khomh, F.: On testing machine learning programs. J. Syst. Softw. 164, 110542 (2020)
Article Google Scholar
Barr, E.T., Harman, M., McMinn, P., Shahbaz, M., Yoo, S.: The Oracle problem in software testing: a survey. IEEE Trans. Softw. Eng. 41(5), 507–525 (2015)
Article Google Scholar
Xie, X., Ho, J.W., Murphy, C., Kaiser, G., Xu, B., Chen, T.Y.: Testing and validating machine learning classifiers by metamorphic testing. J. Syst. Softw. 84(4), 544–558 (2011)
Article Google Scholar
Dwarakanath, A., et al.: Identifying implementation bugs in machine learning based image classifiers using metamorphic testing. In: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 118–128 (2018)
Google Scholar
Humble, J., Farley, D.: Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation. Pearson Education, London (2010)
Google Scholar
Garousi, V., Felderer, M.: Developing, verifying, and maintaining high-quality automated test scripts. IEEE Softw. 33(3), 68–75 (2016)
Article Google Scholar
Khritankov, A.: On feedback loops in lifelong machine learning systems. In: SWQD 2021. LNBIP, vol. 404, pp. 54–65. Springer, Cham (2021)
Google Scholar
Eberhardinger, B., Seebach, H., Knapp, A., Reif, W.: Towards testing self-organizing, adaptive systems. In: Merayo, M.G., de Oca, E.M. (eds.) ICTSS 2014. LNCS, vol. 8763, pp. 180–185. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44857-1_13
Chapter Google Scholar
Foidl, H., Felderer, M., Biffl, S.: Technical debt in data-intensive software systems. In: 2019 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 338–341. IEEE (2019)
Google Scholar
Foidl, H., Felderer, M.: Risk-based data validation in machine learning-based software systems. In: Proceedings of the 3rd ACM SIGSOFT International Workshop on Machine Learning Techniques for Software Quality Evaluation, pp. 13–18 (2019)
Google Scholar
Kästner, C., Kang, E.: Teaching software engineering for AI-enabled systems. arXiv preprint arXiv:2001.06691 (2020)
Hulten, G.: Building Intelligent Systems. Springer, Berkeley (2018). https://doi.org/10.1007/978-1-4842-3432-7
Book Google Scholar

Download references

Acknowledgements

The research reported in this paper has been partly funded by the Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK), the Federal Ministry for Digital and Economic Affairs (BMDW), and the Province of Upper Austria in the frame of the COMET - Competence Centers for Excellent Technologies Programme managed by Austrian Research Promotion Agency FFG.

Author information

Authors and Affiliations

University of Innsbruck, Innsbruck, Austria
Michael Felderer
Software Competence Center Hagenberg GmbH (SCCH), Hagenberg im Mühlkreis, Austria
Rudolf Ramler

Authors

Michael Felderer
View author publications
You can also search for this author in PubMed Google Scholar
Rudolf Ramler
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Michael Felderer .

Editor information

Editors and Affiliations

TU Wien, Vienna, Austria
Dietmar Winkler
TU Wien, Vienna, Austria
Stefan Biffl
Blekinge Institute of Technology, Karlskrona, Sweden
Daniel Mendez
Johannes Kepler University Linz, Linz, Austria
Manuel Wimmer
Software Quality Lab GmbH, Linz, Austria
Johannes Bergsmann

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Felderer, M., Ramler, R. (2021). Quality Assurance for AI-Based Systems: Overview and Challenges (Introduction to Interactive Session). In: Winkler, D., Biffl, S., Mendez, D., Wimmer, M., Bergsmann, J. (eds) Software Quality: Future Perspectives on Software Engineering Quality. SWQD 2021. Lecture Notes in Business Information Processing, vol 404. Springer, Cham. https://doi.org/10.1007/978-3-030-65854-0_3

Download citation

DOI: https://doi.org/10.1007/978-3-030-65854-0_3
Published: 06 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65853-3
Online ISBN: 978-3-030-65854-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics