An empirical characterization of bad practices in continuous integration

  • Fiorella Zampetti
  • Carmine Vassallo
  • Sebastiano Panichella
  • Gerardo Canfora
  • Harald Gall
  • Massimiliano Di Penta


Continuous Integration (CI) has been claimed to introduce several benefits in software development, including high software quality and reliability. However, recent work has pointed out challenges, barriers, and bad practices characterizing its adoption. This paper empirically investigates which bad practices developers experience when applying CI. The investigation was conducted by leveraging semi-structured interviews with 13 experts and mining more than 2,300 Stack Overflow posts. As a result, we compiled a catalog of 79 CI bad smells belonging to 7 categories related to different dimensions of CI pipeline management and process. We also investigated the perceived importance of the identified bad smells through a survey involving 26 professional developers, and discussed how the results of our study relate to existing knowledge about CI bad practices. Whilst some results, such as the poor usage of branches, confirm existing literature, the study also highlights bad practices not previously covered, e.g., related to static analysis tools or the abuse of shell scripts, and contradicts knowledge from existing literature, e.g., about avoiding nightly builds. We discuss the implications of our catalog of CI bad smells for (i) practitioners, e.g., favor specific, portable tools over hacking, and do not ignore or hide build failures, (ii) educators, e.g., teach CI culture, not just technology, and teach CI by providing examples of what not to do, and (iii) researchers, e.g., develop support for failure analysis, as well as automated CI bad smell detectors.


Continuous integration, Empirical study, Bad practices, Survey, Interview



We would like to thank the experts and developers involved in our interviews and those who participated in our online survey. Vassallo, Panichella, and Gall also acknowledge the Swiss National Science Foundation’s support for the project SURF-MobileAppsData (SNF Project No. 200021-166275).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  1. University of Sannio, Benevento, Italy
  2. University of Zurich, Zurich, Switzerland
  3. Zurich University of Applied Sciences, Winterthur, Switzerland
