Machine learning application development: practitioners’ insights

Software Quality Journal

Abstract

Intelligent systems and services are increasingly popular because they provide data-driven solutions to diverse real-world problems, thanks to recent breakthroughs in artificial intelligence (AI) and machine learning (ML). However, the meeting of machine learning and software engineering brings inherent challenges along with its promise. Despite recent research efforts, we still lack a clear understanding of the challenges of developing ML-based applications and of current industry practices. Moreover, it is unclear where software engineering researchers should focus their efforts to better support ML application developers. In this paper, we report on a survey that aimed to understand the challenges and best practices of ML application development. We synthesize the results obtained from 80 practitioners (with diverse skills, experience, and application domains) into 17 findings outlining challenges and best practices for ML application development. Practitioners involved in the development of ML-based software systems can leverage the summarized best practices to improve the quality of their systems. We hope that the reported challenges will inform the research community about topics that need to be investigated to improve the engineering process and the quality of ML-based applications.

Data availability

The datasets generated during and/or analyzed during the current study are available in the replication package, at https://preview.tinyurl.com/ydaj9jh9

Notes

  1. www.atlassian.com/software/jira

  2. www.zenhub.com

  3. https://jupyter.org/

  4. https://www.elastic.co/kibana

  5. https://spark.apache.org/

  6. https://data-miner.io/

  7. https://www.featuretools.com/

  8. https://github.com/marcotcr/lime

  9. https://github.com/slundberg/shap

Acknowledgements

We are grateful to the NSERC and FRQ funding agencies. We sincerely thank the anonymous participants for their valuable time and thoughtful responses to our survey questionnaire.

Author information

Corresponding author

Correspondence to Md Saidur Rahman.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rahman, M.S., Khomh, F., Hamidi, A. et al. Machine learning application development: practitioners’ insights. Software Qual J 31, 1065–1119 (2023). https://doi.org/10.1007/s11219-023-09621-9

  • DOI: https://doi.org/10.1007/s11219-023-09621-9
