Empirical Software Engineering, Volume 23, Issue 4, pp 2362–2397

What are the effects of history length and age on mining software change impact?

  • Leon Moonen
  • Thomas Rolfsnes
  • Dave Binkley
  • Stefano Di Alesio


The goal of Software Change Impact Analysis is to identify artifacts (typically source-code files or individual methods therein) potentially affected by a change. Recently, there has been increased interest in mining software change impact based on evolutionary coupling. A particularly promising approach uses association rule mining to uncover potentially affected artifacts from patterns in the system’s change history. Two main considerations when using this approach are the history length, the number of transactions from the change history used to identify the impact of a change, and history age, the number of transactions that have occurred since patterns were last mined from the history. Although history length and age can significantly affect the quality of mining results, few guidelines exist on how to best select appropriate values for these two parameters. In this paper, we empirically investigate the effects of history length and age on the quality of change impact analysis using mined evolutionary coupling. Specifically, we report on a series of systematic experiments using three state-of-the-art mining algorithms that involve the change histories of two large industrial systems and 17 large open source systems. In these experiments, we vary the length and age of the history used to mine software change impact, and assess how this affects precision and applicability. Results from the study are used to derive practical guidelines for choosing history length and age when applying association rule mining to conduct software change impact analysis.
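To make the two parameters concrete, here is a minimal sketch of how history length and history age select the transactions that are mined. It is not one of the three algorithms evaluated in the paper: the pairwise co-change miner, the function names (`training_window`, `mine_pair_rules`, `recommend`) and the toy history are all assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations


def training_window(history, query_index, length, age):
    """Select the transactions a model would be mined from.

    `history` is ordered oldest to newest. The model is assumed to have been
    mined `age` transactions before the query, from the `length` transactions
    preceding that point.
    """
    end = max(query_index - age, 0)
    start = max(end - length, 0)
    return history[start:end]


def mine_pair_rules(transactions):
    """Mine simple rules a -> b, returning {(a, b): (support_count, confidence)}."""
    item_count = Counter()
    pair_count = Counter()
    for tx in transactions:
        item_count.update(tx)
        # Every ordered pair of distinct files changed together yields a rule.
        pair_count.update(permutations(sorted(tx), 2))
    return {
        (a, b): (n, n / item_count[a])
        for (a, b), n in pair_count.items()
    }


def recommend(rules, changed):
    """Rank files outside `changed` by the best confidence of any applicable rule."""
    scores = {}
    for (a, b), (_support, conf) in rules.items():
        if a in changed and b not in changed:
            scores[b] = max(scores.get(b, 0.0), conf)
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical change history: each transaction is the set of files in one commit.
history = [
    {"a.c", "a.h"},
    {"a.c", "a.h", "b.c"},
    {"b.c", "b.h"},
    {"a.c", "a.h"},
    {"c.c"},
    {"a.c", "b.c"},
]

# Query is transaction 5; the model is one transaction old and uses three transactions.
window = training_window(history, query_index=5, length=3, age=1)
rules = mine_pair_rules(window)
impact = recommend(rules, {"a.c"})
no_impact = recommend(rules, {"c.c"})
```

In this toy setup, a change to `a.c` yields a ranked list of likely co-changing files, while a change to `c.c` yields nothing because `c.c` never appears in the selected window. That second case loosely illustrates why the choice of length and age can affect applicability as well as precision.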


Change impact analysis · Evolutionary coupling · Association rule mining · Parameter tuning



This work is supported by the Research Council of Norway through the EvolveIT project (#221751/F20) and the Certus SFI (#203461/030). Dr. Binkley was supported by NSF grant IIA-1360707 and a J. William Fulbright award.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Simula Research Laboratory, Oslo, Norway
  2. Loyola University Maryland, Baltimore, USA
