Automated Software Engineering

Volume 23, Issue 4, pp 619–647

Identifying and understanding header file hotspots in C/C++ build processes

  • Shane McIntosh
  • Bram Adams
  • Meiyappan Nagappan
  • Ahmed E. Hassan


Software developers rely on a fast build system to incrementally compile their source code changes and produce modified deliverables for testing and deployment. Header files, which tend to trigger slow rebuild processes, are most problematic if they also change frequently during the development process, and hence, need to be rebuilt often. In this paper, we propose an approach that analyzes the build dependency graph (i.e., the data structure used to determine the minimal list of commands that must be executed when a source code file is modified), and the change history of a software system to pinpoint header file hotspots—header files that change frequently and trigger long rebuild processes. Through a case study on the GLib, PostgreSQL, Qt, and Ruby systems, we show that our approach identifies header file hotspots that, if improved, will provide greater improvement to the total future build cost of a system than just focusing on the files that trigger the slowest rebuild processes, change the most frequently, or are used the most throughout the codebase. Furthermore, regression models built using architectural and code properties of source files can explain 32–57 % of these hotspots, identifying subsystems that are particularly hotspot-prone and would benefit the most from architectural refinement.
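The core idea in the abstract, combining rebuild cost from the build dependency graph with change frequency from the version history, can be illustrated with a minimal sketch. This is not the authors' implementation: the graph layout, the per-target cost values, the function names `rebuild_cost` and `find_hotspots`, and the median-based thresholds are all assumptions chosen for illustration, and the sample data is invented.

```python
from collections import deque
from statistics import median


def rebuild_cost(graph, cost, start):
    """Sum the cost of every target transitively triggered by changing `start`.

    `graph` maps a file or target to the targets that must be rebuilt when it
    changes; `cost` maps a target to its (hypothetical) build time in seconds.
    """
    seen = {start}
    queue = deque([start])
    total = 0.0
    while queue:
        node = queue.popleft()
        for dependent in graph.get(node, ()):
            if dependent not in seen:
                seen.add(dependent)
                total += cost.get(dependent, 0.0)
                queue.append(dependent)
    return total


def find_hotspots(graph, cost, changes):
    """Return headers whose rebuild cost AND change count are at or above the median.

    The median cut-offs are an assumption of this sketch, not a threshold
    taken from the paper.
    """
    headers = [name for name in changes if name.endswith(".h")]
    if not headers:
        return []
    costs = {h: rebuild_cost(graph, cost, h) for h in headers}
    cost_cut = median(costs.values())
    change_cut = median(changes[h] for h in headers)
    return sorted(
        h for h in headers if costs[h] >= cost_cut and changes[h] >= change_cut
    )


# Invented toy data: three headers, three object files, two deliverables.
graph = {
    "core.h": ["a.o", "b.o", "c.o"],
    "util.h": ["a.o"],
    "rare.h": ["b.o", "c.o"],
    "a.o": ["app"],
    "b.o": ["app"],
    "c.o": ["libc.so"],
}
cost = {"a.o": 2.0, "b.o": 3.0, "c.o": 4.0, "app": 1.0, "libc.so": 1.5}
changes = {"core.h": 40, "util.h": 55, "rare.h": 2}

print(find_hotspots(graph, cost, changes))
```

In this toy data, `util.h` changes most often but triggers a cheap rebuild, and `rare.h` triggers an expensive rebuild but rarely changes; only `core.h` is both costly and frequently changed, which is the kind of file the abstract calls a hotspot.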


Build systems, Performance analysis, Mining software repositories



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Shane McIntosh (1)
  • Bram Adams (2)
  • Meiyappan Nagappan (3)
  • Ahmed E. Hassan (4)

  1. Department of Electrical and Computer Engineering, McGill University, Montreal, Canada
  2. Lab on Maintenance, Construction, and Intelligence of Software (MCIS), Polytechnique Montréal, Montreal, Canada
  3. Department of Software Engineering, Rochester Institute of Technology, Rochester, USA
  4. Software Analysis and Intelligence Lab (SAIL), Queen’s University, Kingston, Canada
