ID-correspondence: a measure for detecting evolutionary coupling

Empirical Software Engineering

Abstract

Evolutionary coupling is a well-investigated phenomenon in software maintenance research and practice. Association rules and two related measures, support and confidence, have been used to identify evolutionary coupling among program entities. However, these measures emphasize only the co-change (i.e., changing together) frequency of entities and cannot determine whether the entities co-evolved by experiencing related changes. Consequently, the approach reports false positives and fails to detect evolutionary coupling among infrequently co-changed entities. We propose a new measure, identifier correspondence (id-correspondence), that quantifies the extent to which the changes made to co-changed entities are related, based on identifier similarity. Identifiers are the names given to program entities such as variables, methods, classes, packages, interfaces, structures, and unions. We use the Dice-Sørensen coefficient to measure lexical similarity between the identifiers involved in the changed lines of the co-changed entities. Our investigation of thousands of revisions from nine subject systems covering three programming languages shows that id-correspondence can considerably improve the detection accuracy of evolutionary coupling. It outperforms the existing state-of-the-art evolutionary-coupling-based techniques, with significantly higher recall and F-score in predicting future co-change candidates.
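The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify how identifiers are tokenized or how similarities are aggregated into id-correspondence. The sketch assumes that the Dice-Sørensen coefficient is computed over character bigrams of lowercased identifiers, and that support is the count of commits changing both entities. The function names `bigrams`, `dice`, and `support_and_confidence` are hypothetical.

```python
from collections import Counter


def bigrams(identifier: str) -> Counter:
    """Multiset of adjacent character pairs in a lowercased identifier."""
    s = identifier.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice(a: str, b: str) -> float:
    """Dice-Sørensen coefficient 2|X ∩ Y| / (|X| + |Y|) over bigram multisets.

    Assumption: bigram tokenization; the paper may tokenize differently.
    """
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 0.0  # both identifiers are shorter than two characters
    overlap = sum((x & y).values())  # multiset intersection keeps minimum counts
    return 2.0 * overlap / total


def support_and_confidence(commits, a, b):
    """Association-rule measures for the rule a => b.

    commits: iterable of sets, each holding the entities changed in one commit.
    Support is taken here as the number of commits that changed both a and b;
    confidence is that count divided by the number of commits that changed a.
    """
    commits = list(commits)
    both = sum(1 for changed in commits if a in changed and b in changed)
    with_a = sum(1 for changed in commits if a in changed)
    return both, (both / with_a if with_a else 0.0)


if __name__ == "__main__":
    # Hypothetical commit history: each set lists the entities changed together.
    history = [{"A", "B"}, {"A"}, {"A", "B", "C"}, {"C"}]
    print(support_and_confidence(history, "A", "B"))
    print(dice("getUserName", "setUserName"))
```

Support and confidence depend only on how often two entities appear in the same commit, whereas the Dice score looks at the names touched by the changes. That difference is what the abstract relies on when it argues that frequency alone cannot tell whether co-changed entities received related changes.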

Notes

  1. Sørensen–Dice coefficient. https://en.wikipedia.org/wiki/s%c3%b8rensen%e2%80%93dice_coefficient

  2. Strike a match. http://www.catalysoft.com/articles/strikeamatch.html.

  3. SourceForge. https://sourceforge.net/.

  4. Exuberant CTAGS. https://sourceforge.net/projects/ctags/.

  5. Implementation and database. https://drive.google.com/open?id=17biLYZu-nfzj_wiMG-PTvfiuuiXOWDXR.

  6. Wilcoxon signed rank test online. http://www.statskingdom.com/175wilcoxon/signed/ranks.html.

  7. Wilcoxon signed rank test. https://en.wikipedia.org/wiki/wilcoxon/signed-rank/test.

  8. Sørensen–Dice coefficient. https://en.wikipedia.org/wiki/s%c3%b8rensen%e2%80%93dice_coefficient

Acknowledgments

This research is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), and by two Canada First Research Excellence Fund (CFREF) grants coordinated by the Global Institute for Food Security (GIFS) and the Global Institute for Water Security (GIWS).

Author information

Corresponding author

Correspondence to Manishankar Mondal.

Additional information

Communicated by: Andrea De Lucia

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mondal, M., Roy, B., Roy, C.K. et al. ID-correspondence: a measure for detecting evolutionary coupling. Empir Software Eng 26, 5 (2021). https://doi.org/10.1007/s10664-020-09921-9

  • DOI: https://doi.org/10.1007/s10664-020-09921-9
