Automatically identifying changes that impact code-to-design traceability during evolution

Abstract

An approach is presented that automatically determines whether a given source code change impacts the design (i.e., the UML class diagram) of a system. This allows code-to-design traceability to be maintained consistently as the source code evolves. The approach uses lightweight analysis and syntactic differencing of source code changes to determine whether a change alters the class diagram in the context of abstract design. The intent is to support both updating design documents simultaneously with code changes and bringing outdated design documents up to date with the current code, given the change history. An efficient tool was developed to support the approach and was applied to an open source system. The results were evaluated against manual inspection by human experts, and the tool performed better than the (error-prone) manual inspection. The approach and tool were then used to empirically investigate how changes to source code (i.e., commits) break code-to-design traceability during evolution, and what benefits follow from such an understanding. Commits are categorized as either design impact or no impact. The commits of four open source projects over 3-year periods are extracted and analyzed. The results show that most code changes do not impact the design, and that these commits change fewer files and fewer lines than commits with design impact. The results also show that most bug fixes do not impact the design.
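
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' tool: it assumes Python source analyzed with the standard `ast` module, and it assumes that the design-level facts are classes, generalizations (base classes), and method signatures. The names `design_facts` and `classify_change` and the sample sources are hypothetical. A change is labeled design impact only when the set of design-level facts differs between the two versions.

```python
import ast


def design_facts(source: str) -> set:
    """Collect class-diagram-level facts from one version of a file.

    Assumption: only classes, base classes, and method names with their
    parameter names count as design; method bodies are ignored.
    """
    facts = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            facts.add(("class", node.name))
            for base in node.bases:
                facts.add(("generalization", node.name, ast.unparse(base)))
            for item in node.body:
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    params = tuple(arg.arg for arg in item.args.args)
                    facts.add(("method", node.name, item.name, params))
    return facts


def classify_change(old_source: str, new_source: str):
    """Label a change and report which design facts were added or removed."""
    old, new = design_facts(old_source), design_facts(new_source)
    added, removed = new - old, old - new
    label = "design impact" if (added or removed) else "no impact"
    return label, added, removed


# Hypothetical versions of one file, for illustration only.
OLD = """
class Shape:
    def area(self):
        return 0
"""

# Only a method body changes, as in a typical bug fix.
BUGFIX = """
class Shape:
    def area(self):
        return 0.0
"""

# A new class and a new generalization are introduced.
REDESIGN = """
class Shape:
    def area(self):
        return 0.0

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2
"""

if __name__ == "__main__":
    print(classify_change(OLD, BUGFIX)[0])
    print(classify_change(OLD, REDESIGN)[0])
```

Comparing sets of extracted facts, rather than raw text, is what keeps body-only edits such as the bug fix above from being reported as design changes.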

Notes

  1. Complete study is at www.sdml.info/downloads/designstudy.pdf.

Acknowledgments

This research is funded in part by the U.S. National Science Foundation under NSF grant CCF 08-11140.

Author information

Corresponding author

Correspondence to Maen Hammad.

About this article

Cite this article

Hammad, M., Collard, M.L. & Maletic, J.I. Automatically identifying changes that impact code-to-design traceability during evolution. Software Qual J 19, 35–64 (2011). https://doi.org/10.1007/s11219-010-9103-x

Keywords

  • Software evolution
  • Design change
  • Software traceability
  • Commit analysis