Empirical Software Engineering

  • Yann-Gaël Guéhéneuc
  • Foutse Khomh


Software engineering has existed as a discipline since the 1960s, when participants of the 1968 NATO Software Engineering Conference in Garmisch, Germany, recognised that there was a “software crisis” due to the increased complexity of systems and of the software running (on) these systems. The software crisis led to the acknowledgement that software engineering is more than computing theories and efficiency of code and that it requires dedicated research. Thus, this crisis was the starting point of software engineering research. Software engineering research acknowledged early that software engineering is fundamentally an empirical discipline, thus further distinguishing computer science from software engineering, because (1) software is immaterial and does not obey physical laws and (2) software is written by people for people. In this chapter, we first introduce the concepts and principles on which empirical software engineering is based. Then, using these concepts and principles, we describe seminal works that led to the inception and popularisation of empirical software engineering research. We use these seminal works to discuss some idioms, patterns, and styles in empirical software engineering before discussing some challenges that empirical software engineering must overcome in the (near) future. Finally, we conclude and suggest further readings and future directions.





The authors received amazing support, suggestions, and corrections from many colleagues and students, including but not limited to Mona Abidi, Giuliano Antoniol, Sung-deok Cha, Massimiliano Di Penta, Manel Grichi, Kyo Kang, Rubén Saborido-Infantes, and Audrey W.J. Wong. Any errors remaining in this chapter are solely due to the authors.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Polytechnique Montréal and Concordia University, Montreal, Canada
  2. Polytechnique Montréal, Montreal, Canada
