Abstract
As developers modify software entities such as functions or variables to introduce new features, enhance old ones, or fix bugs, they must ensure that other entities in the software system are updated to be consistent with these changes. Many hard-to-find bugs are introduced by developers who did not notice dependencies between entities and failed to propagate changes correctly. Most modern development environments offer tools to assist developers in propagating changes. For example, dependency browsers show static code dependencies between source code entities. Other sources of information, such as historical co-change or code layout information, could be used by tools to support developers in propagating changes. We present the Development Replay (DR) approach, which empirically assesses and compares the effectiveness of several not-yet-existing change propagation tools by reenacting the changes stored in source control repositories using these tools. We present a case study of five large open source systems with a total of over 40 years of development history. Our empirical results show that historical co-change information recovered from source control repositories, along with code layout information, can guide developers in propagating changes better than simple static dependency information.
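As a rough illustration of the replay idea, the sketch below reenacts a short, made-up history of change sets against a hypothetical co-change heuristic and scores each suggestion with precision and recall. The names `replay` and `precision_recall`, the sample history, and the simplification of treating the first entity of each change set as the starting point are all assumptions made for illustration; this is not the authors' implementation.

```python
from collections import defaultdict


def precision_recall(suggested, actual):
    """Precision and recall of a suggested set against the entities that actually changed."""
    hits = len(suggested & actual)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


def replay(change_sets):
    """Replay change sets in order. Before each one, ask a co-change heuristic
    what else should change given the first entity, then score the answer."""
    cochange = defaultdict(set)  # entity -> entities it changed with in the past
    scores = []
    for change in change_sets:
        start, *rest = change
        actual = set(rest)
        if actual:
            suggested = set(cochange[start])
            scores.append(precision_recall(suggested, actual))
        # Only after scoring does the heuristic learn from this change set.
        for a in change:
            for b in change:
                if a != b:
                    cochange[a].add(b)
    return scores


# Hypothetical history: each inner list is one change set (entities changed together).
history = [
    ["parse", "lex"],
    ["parse", "lex", "emit"],
    ["parse", "emit"],
]
scores = replay(history)
```

Because the heuristic only sees change sets that precede the one being scored, the replay does not leak future information into the suggestions. Swapping in a different suggestion source (for example static dependencies or code layout) would let the same loop compare heuristics on identical history.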
Notes
We use a parametric paired \(t\)-test and a non-parametric paired Wilcoxon signed-rank test. The \(t\)-test is performed on the square root of the precision/recall for each change set so that the data is normally distributed, a requirement for the \(t\)-test. Given the large number of change sets used in our analysis, the normality of the data is not a major concern, as the \(t\)-test is a robust test. Nevertheless, we ensure normality to guarantee the validity of our results.
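The paired \(t\) statistic on square-root-transformed values can be sketched as follows. The function name `paired_t_statistic` and the recall values are made up for illustration and are not results from the study; in practice a statistics library (for example SciPy's `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon`) would typically be used to obtain p-values as well.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(xs, ys):
    """Return (t, degrees of freedom) for paired samples after a square-root transform."""
    diffs = [math.sqrt(x) - math.sqrt(y) for x, y in zip(xs, ys)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Made-up per-change-set recall for two hypothetical heuristics.
recall_a = [0.81, 0.64, 0.49, 1.00]
recall_b = [0.64, 0.49, 0.25, 0.81]
t, dof = paired_t_statistic(recall_a, recall_b)
```

The test is paired because both heuristics are scored on the same change sets, so each difference compares like with like.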
Acknowledgments
The authors gratefully acknowledge the significant contributions from the members of the open source community who have given freely of their time to produce large software systems with rich and detailed source code repositories; and who assisted us in understanding and acquiring these valuable repositories. They also thank Lionel Briand for his very helpful comments and suggestions to improve the statistical analysis and presentation of our results.
Cite this article
Hassan, A.E., Holt, R.C. Replaying development history to assess the effectiveness of change propagation tools. Empir Software Eng 11, 335–367 (2006). https://doi.org/10.1007/s10664-006-9006-4
DOI: https://doi.org/10.1007/s10664-006-9006-4