The AIQ Meta-Testbed: Pragmatically Bridging Academic AI Testing and Industrial Q Needs

Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 404)


AI solutions seem to appear in any and all application domains. As AI becomes more pervasive, the importance of quality assurance increases. Unfortunately, there is no consensus on what artificial intelligence means, and interpretations range from simple statistical analysis to sentient humanoid robots. On top of that, quality is a notoriously hard concept to pinpoint. What does this mean for AI quality? In this paper, we share our working definition and a pragmatic approach to address the corresponding quality assurance with a focus on testing. Finally, we present our ongoing work on establishing the AIQ Meta-Testbed.


  • Artificial intelligence
  • Machine learning
  • Quality assurance
  • Software testing
  • Testbed


  4. Well aware of the two previous “AI winters”, periods with less interest and funding due to inflated expectations.


This work was funded by Plattformen at Campus Helsingborg, Lund University.

Corresponding author

Correspondence to Markus Borg.

Copyright information

© 2021 Springer Nature Switzerland AG

Cite this paper

Borg, M. (2021). The AIQ Meta-Testbed: Pragmatically Bridging Academic AI Testing and Industrial Q Needs. In: Winkler, D., Biffl, S., Mendez, D., Wimmer, M., Bergsmann, J. (eds) Software Quality: Future Perspectives on Software Engineering Quality. SWQD 2021. Lecture Notes in Business Information Processing, vol 404. Springer, Cham.

  • DOI:

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-65853-3

  • Online ISBN: 978-3-030-65854-0

  • eBook Packages: Computer Science, Computer Science (R0)