Empirical Analysis of Hidden Technical Debt Patterns in Machine Learning Software

Part of the Lecture Notes in Computer Science book series (LNPSE, volume 11915)

Abstract

[Context/Background] Machine Learning (ML) software is especially prone to accumulating technical debt: it has ML-specific issues in addition to all the problems of regular code. The term “Hidden Technical Debt” (HTD) was coined by Sculley et al. to address maintainability issues in ML software, by analogy with technical debt in traditional software. [Goal] The aim of this paper is to empirically analyse how HTD patterns emerge during the early development phase of ML software, namely the prototyping phase. [Method] We conducted a case study whose subject systems are ML models planned for integration into the software system owned by Västtrafik, the public transportation agency in western Sweden. [Results] During our case study, we detected the HTD patterns that have the potential to emerge in ML prototypes, with the exception of “Legacy Features”, “Correlated Features”, and “Plain Old Data Type Smell”. [Conclusion] Preliminary results indicate that a significant number of HTD patterns can emerge during the prototyping phase. However, generalizing our results requires analyses of further ML systems from various domains.

Keywords

  • Machine learning
  • Software maintainability
  • Hidden Technical Debt

Notes

  1. https://github.com/gulcalikli/ProfesShortPaper

References

  1. Sculley, D., et al.: Hidden technical debt in machine learning systems. In: Proceedings of NIPS 2015, pp. 2503–2511. MIT Press, Montreal (2015)

  2. Gomez-Uribe, C.A., Hunt, N.: The Netflix recommender system: algorithms, business value, and innovation. ACM Trans. Manag. Inf. Syst. 6(4), 13:1–13:19 (2016)

  3. Kenthapadi, K., Le, B., Venkataraman, G.: Personalized job recommendation system at LinkedIn: practical challenges and lessons learned. In: Proceedings of 11th ACM Conference on Recommender Systems, Como, Italy, pp. 346–347 (2017)

  4. Hazelwood, K., et al.: Applied machine learning at Facebook: a datacenter infrastructure perspective. In: IEEE International Symposium on High Performance Computer Architecture Proceedings, pp. 620–629, Vienna, Austria (2018)

  5. Martinez-Plumed, F., et al.: Accounting for the neglected dimensions of AI progress (2018). https://arxiv.org/abs/1806.00610

  6. Agarwal, A., et al.: Making contextual decisions with low technical debt (2017). https://arxiv.org/abs/1806

  7. Breck, E., Cai, S., Nielsen, E., Salib, M., Sculley, D.: The ML test score: a rubric for ML production readiness and technical debt reduction. In: BigData 2018, pp. 1123–1133. IEEE, Boston (2017)

  8. Kim, J., Feldt, R., Yoo, S.: Guiding deep learning testing using surprise adequacy. In: ICSE 2019, pp. 303–314. IEEE, Montreal (2019)

  9. Balzer, P.: Prediction of car park occupancy. http://mechlab-engineering.de/2015/03/vorhersage-derparkhausbelegung-mit-offenen-daten/. Accessed May 2019

  10. Liu, H., Motoda, H.: Feature Selection for Knowledge Discovery and Data Mining. Springer, New York (1998). https://doi.org/10.1007/978-1-4615-5689-3

Author information

Correspondence to Mohannad Alahdab or Gül Çalıklı.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Alahdab, M., Çalıklı, G. (2019). Empirical Analysis of Hidden Technical Debt Patterns in Machine Learning Software. In: Franch, X., Männistö, T., Martínez-Fernández, S. (eds) Product-Focused Software Process Improvement. PROFES 2019. Lecture Notes in Computer Science, vol 11915. Springer, Cham. https://doi.org/10.1007/978-3-030-35333-9_14

  • DOI: https://doi.org/10.1007/978-3-030-35333-9_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-35332-2

  • Online ISBN: 978-3-030-35333-9

  • eBook Packages: Computer Science, Computer Science (R0)