An empirical characterization of bad practices in continuous integration


Continuous Integration (CI) has been claimed to introduce several benefits in software development, including high software quality and reliability. However, recent work pointed out challenges, barriers and bad practices characterizing its adoption. This paper empirically investigates what are the bad practices experienced by developers applying CI. The investigation has been conducted by leveraging semi-structured interviews of 13 experts and mining more than 2,300 Stack Overflow posts. As a result, we compiled a catalog of 79 CI bad smells belonging to 7 categories related to different dimensions of a CI pipeline management and process. We have also investigated the perceived importance of the identified bad smells through a survey involving 26 professional developers, and discussed how the results of our study relate to existing knowledge about CI bad practices. Whilst some results, such as the poor usage of branches, confirm existing literature, the study also highlights uncovered bad practices, e.g., related to static analysis tools or the abuse of shell scripts, and contradict knowledge from existing literature, e.g., about avoiding nightly builds. We discuss the implications of our catalog of CI bad smells for (i) practitioners, e.g., favor specific, portable tools over hacking, and do not ignore nor hide build failures, (ii) educators, e.g., teach CI culture, not just technology, and teach CI by providing examples of what not to do, and (iii) researchers, e.g., developing support for failure analysis, as well as automated CI bad smell detectors.

This is a preview of subscription content, log in to check access.

Fig. 1


  1. 1.

  2. 2.

    We have queried the SO database in May 2017

  3. 3.

  4. 4.

    The sum is > 26 as multiple build automation tools may be used.

  5. 5.

    This is beyond test suite optimization, which is an important problem in testing, but out of scope for this investigation.

  6. 6.

  7. 7.


  1. Abdalkareem R, Mujahid S, Shihab E, Rilling J (2019) Which commits can be CI skipped?. IEEE Trans Softw Eng:1–1.

  2. Amazon (2017) What is continuous delivery?

  3. Basili V R (1992) Software modeling and measurement: The goal/question/metric paradigm. Tech. rep. College Park

  4. Beck K (2000) Extreme programming explained: embrace change. Addison-Wesley Professional

  5. Bell J, Legunsen O, Hilton M, Eloussi L, Yung T, Marinov D (2018) Deflaker: automatically detecting flaky tests. In: Proceedings of the 40th International Conference on Software Engineering, ICSE 2018, Gothenburg, Sweden, pp 433–444

  6. Bell J, Booch G (1991) Object Oriented Design: With Applications. Benjamin Cummings

  7. Bell J, Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas

  8. Bell J, Deshpande A, Riehle D (2008) Continuous integration in open source software development. Open source development, communities and quality

  9. Bell J, Beller M, Bholanath R, McIntosh S, Zaidman A (2016) Analyzing the state of static analysis: A large-scale evaluation in open source software. In: IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)

  10. Bell J, Beller M, Gousios G, Zaidman A (2017) Oops, my tests broke the build: An explorative analysis of travis ci with github. In: Proceedings of the 14th International Conference on Mining Software Repositories. IEEE Press

  11. Bell J, Chen L (2017) Continuous delivery: Overcoming adoption challenges. Journal of Systems and Software

  12. Duvall P, Matyas S M, Glover A (2007) Continuous Integration: Improving Software Quality and Reducing Risk. Addison-Wesley

  13. Duvall PM (2010) Continuous integration. patterns and antipatterns. DZone refcard #84

  14. Duvall PM (2011) Continuous delivery: Patterns and antipatterns in the software life cycle. DZone refcard #145

  15. Fowler M, Beck K, Brant J (1999a) Refactoring: improving the design of existing code. Addison-Wesley

  16. Fowler M, Beck K, Brant J, Opdyke W, Roberts D (1999b) Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional

  17. Gallaba K, McIntosh S (2018) Use and misuse of continuous integration features: An empirical study of projects that (mis)use Travis CI. IEEE Transactions on Software Engineering (to appear):1–1.

    Article  Google Scholar 

  18. Ghaleb T A, da Costa D A, Zou Y (2019) An empirical study of the long duration of continuous integration builds. Empir Softw Eng 24(4):2102–2139

    Article  Google Scholar 

  19. Goodman LA (1961) Snowball sampling. The annals of mathematical statistics

  20. Hilton M, Tunnell T, Huang K, Marinov D, Dig D (2016) Usage, costs, and benefits of continuous integration in open-source projects. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering (ASE)

  21. Hilton M, Nelson N, Tunnell T, Marinov D, Dig D (2017) Trade-offs In continuous integration: Assurance, security, and flexibility. In: Proceedings of the 25th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE

  22. Humble J, Farley D (2010) Continuous delivery: Reliable Software Releases Through Build, Test, and Deployment Automation. Addison-Wesley Professional

  23. Johnson B, Song Y, Murphy-Hill E, Bowdidge R (2013) Why don’t software developers use static analysis tools to find bugs? In: 2013 35th International Conference on Software Engineering (ICSE). IEEE

  24. Kerzazi N, Khomh F, Adams B (2014) Why do automated builds break? an empirical study In: 30th IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE

  25. Krippendorff K (1980) Content analysis: An introduction to its methodology. Sage

  26. Luo Q, Hariri F, Eloussi L, Marinov D (2014) An empirical analysis of flaky tests. In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, (FSE-22), Hong Kong, China, pp 643–653

  27. Orellana MG, Cordero AM, Laghari G, Demeyer S (2017) On the differences between unit and integration testing in the travistorrent dataset. In: Proceedings of the 14th working conference on mining software repositories

  28. McIntosh S, Adams B, Nguyen T H, Kamei Y, Hassan AE (2011) An empirical study of build maintenance effort. In: Proceedings of the Int’l Conference on Software Engineering (ICSE)

  29. Moreno L, Bavota G, Di Penta M, Oliveto R, Marcus A, Canfora G (2017) ARENA: an approach for the automated generation of release notes, vol 43

    Article  Google Scholar 

  30. Olsson HH, Alahyari H, Bosch J (2012) Climbing the ”stairway to heaven” – a mulitiple-case study exploring barriers in the transition from agile development towards continuous deployment of software. In: Proceedings of the 2012 38th Euromicro Conference on Software Engineering and Advanced Applications, SEAA ’12

  31. Oppenheim B (1992) Questionnaire Design, Interviewing and Attitude Measurement. Pinter Publishers

  32. Palomba F, Zaidman A (2017) Does refactoring of test smells induce fixing flaky tests?. In: 2017 IEEE International conference on software maintenance and evolution, ICSME 2017, shanghai, China, pp 1–12

  33. Potdar A, Shihab E (2014) An exploratory study on self-admitted technical debt. In: 30th IEEE International Conference on Software Maintenance and Evolution

  34. Rahman MT, Querel LP, Rigby PC, Adams B (2016) Feature toggles: practitioner practices and a case study. In: 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR). IEEE

  35. Rahman A, Parnin C, Williams L (2019) The seven sins: security smells in infrastructure as code scripts. In: Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, pp 164–175

  36. Rastkar S, Murphy G C, Murray G (2014) Automatic summarization of bug reports. IEEE Trans Softw Eng 40(4):366–380

    Article  Google Scholar 

  37. Savor T, Douglas M, Gentili M, Williams L, Beck K, Stumm M (2016) Continuous deployment at facebook and OANDA. In: Companion proceedings of the 38th International Conference on Software Engineering (ICSE Companion)

  38. Seo H, Sadowski C, Elbaum S G, Aftandilian E, Bowdidge R W (2014) Programmers’ build errors: a case study (at Google). In: Proceedings of Int’l Conf on Software Engineering (ICSE)

  39. Spencer D (2009) Card sorting: Designing usable categories. Rosenfeld Media

  40. Ståhl D, Bosch J (2014a) Automated software integration flows in industry: a multiple-case study. In: Companion Proceedings of the 36th International Conference on Software Engineering. ACM

  41. Ståhl D, Bosch J (2014b) Modeling continuous integration practice differences in industry software development. J Syst Softw

  42. Thorve S, Sreshtha C, Meng N (2018) An empirical study of flaky tests in android apps. In: 2018 IEEE International conference on software maintenance and evolution, ICSME 2018, Madrid, Spain, pp 534–538

  43. van Deursen A, Moonen L, Bergh A, Kok G (2001) Refactoring test code. In: Proceedings of the 2nd International Conference on Extreme Programming and Flexible Processes in Software Engineering (XP)

  44. Vasilescu B, Yu Y, Wang H, Devanbu P, Filkov V (2015) Quality and productivity outcomes relating to continuous integration in github. In: Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering. ACM

  45. Vassallo C, Zampetti F, Romano D, Beller M, Panichella A, Di Penta M, Zaidman A (2016) Continuous delivery practices in a large financial organization. In: 32nd IEEE International Conference on Software Maintenance and Evolution (ICSME)

  46. Vassallo C, Schermann G, Zampetti F, Romano D, Leitner P, Zaidman A, Di Penta M, Panichella S (2017) A tale of ci build failures: an open source and a financial organization perspective. In: 2017 IEEE International Conference on Software maintenance and evolution (ICSME). IEEE

  47. Vassallo C, Proksch S, Gall H, Di Penta M (2019a) Automated reporting of anti-patterns and decay in continuous integration. In: Proceedings of the 41st International Conference on Software Engineering, ICSE 2019. IEEE, Montreal, pp (to appear)

  48. Vassallo C, Proksch S, Zemp T, Gall HC (2019b) Every build you break: Developer-oriented assistance for build failure resolution. Empirical Software Engineering (To appear)

  49. Wedyan F, Alrmuny D, Bieman J M (2009) The effectiveness of automated static analysis tools for fault detection and refactoring prediction. In: Second international conference on software testing verification and validation, ICST 2009, Denver, Colorado, pp 141–150

  50. Zampetti F, Scalabrino S, Oliveto R, Canfora G, Di Penta M (2017) How open source projects use static code analysis tools in continuous integration pipelines. In: Proceedings of the 14th International Conference on Mining Software Repositories. IEEE Press

  51. Zampetti F, Vassallo C, Panichella S, Canfora G, Gall H, Di Penta M (2019) An empirical characterization of bad practices in continuous delivery (online appendix). Technical report,

Download references


We would like to thank experts/developers involved in our interviews and those who participated in our online survey. Vassallo, Panichella, and Gall also acknowledge the Swiss National Science Foundation’s support for the project SURF-MobileAppsData (SNF Project No. 200021-166275).

Author information



Corresponding author

Correspondence to Fiorella Zampetti.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by: Christoph Treude

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Zampetti, F., Vassallo, C., Panichella, S. et al. An empirical characterization of bad practices in continuous integration. Empir Software Eng 25, 1095–1135 (2020).

Download citation


  • Continuous integration
  • Empirical study
  • Bad practices
  • Survey
  • Interview