
RePOR: Mimicking humans on refactoring tasks. Are we there yet?

Empirical Software Engineering

Abstract

Refactoring is a maintenance activity that aims to improve design quality while preserving the behavior of a system. Several (semi-)automated approaches have been proposed to support developers in this activity, based on the correction of anti-patterns, which are “poor” solutions to recurring design problems. However, little quantitative evidence exists about the impact of automatically refactored code on program comprehension, and about the contexts in which automated refactoring can be as effective as manual refactoring. Leveraging RePOR, an automated refactoring approach based on partial order reduction techniques, we performed an empirical study to investigate whether the structure of automatically refactored code affects the understandability of systems during comprehension tasks. (1) We surveyed 80 developers, asking them to identify, from a set of 20 refactoring changes, whether each change was generated by developers or by a tool, and to rate the refactoring changes according to their design quality; (2) we asked 30 developers to complete code comprehension tasks on 10 systems that were refactored by either a freelancer or an automated refactoring tool. To make the comparison fair, for the subset of refactoring actions that introduce new code entities, only synthetic identifiers were presented to practitioners. We measured developers’ performance using the NASA Task Load Index for their effort, the time that they spent performing the tasks, and their percentages of correct answers. Our findings, despite current technology limitations, show that it is reasonable to expect a refactoring tool to match developer code. Indeed, results show that for 3 out of the 5 anti-pattern types studied, developers could not recognize the origin of the refactoring (i.e., whether it was performed by a human or an automated tool). We also observed that developers do not prefer human refactorings over automated refactorings, except when refactoring Blob classes, and that there is no statistically significant difference between the impact of human refactorings and automated refactorings on code understandability. We conclude that automated refactorings can be as effective as manual refactorings. However, for complex anti-pattern types like the Blob, the perceived quality achieved by developers is slightly higher.
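
To illustrate the kind of behavior-preserving transformation studied here, the listing below sketches an Extract Class refactoring that decomposes a Blob-like class by moving a cohesive reporting responsibility into a new class. The example is hypothetical: the class and method names are invented for illustration and are not taken from the paper, from RePOR’s output, or from the refactored study systems.

    import java.util.ArrayList;
    import java.util.List;

    // Before: a Blob-like class that mixes order bookkeeping with report generation.
    class OrderManager {
        private final List<String> orders = new ArrayList<>();

        void addOrder(String order) {
            orders.add(order);
        }

        // Reporting logic tangled into the same class.
        String buildReport() {
            StringBuilder sb = new StringBuilder("Orders:\n");
            for (String o : orders) {
                sb.append(" - ").append(o).append('\n');
            }
            return sb.toString();
        }
    }

    // After: Extract Class moves the reporting responsibility into its own class.
    class OrderReporter {
        String buildReport(List<String> orders) {
            StringBuilder sb = new StringBuilder("Orders:\n");
            for (String o : orders) {
                sb.append(" - ").append(o).append('\n');
            }
            return sb.toString();
        }
    }

    // The original class now delegates; callers observe the same report text,
    // so the refactoring preserves behavior while improving cohesion.
    class RefactoredOrderManager {
        private final List<String> orders = new ArrayList<>();
        private final OrderReporter reporter = new OrderReporter();

        void addOrder(String order) {
            orders.add(order);
        }

        String buildReport() {
            return reporter.buildReport(orders);
        }
    }

In the study, newly introduced entities were shown to participants with synthetic identifiers precisely so that descriptive names like the ones above could not reveal whether a human or a tool produced the change.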


Notes

  1. Freelancer.com and Guru.com

  2. https://sourceforge.net/

  3. https://github.com/moar82/RefGen

  4. http://www.jotform.com


Author information


Corresponding author

Correspondence to Rodrigo Morales.

Additional information

Communicated by: Vittorio Cortellessa



About this article


Cite this article

Morales, R., Khomh, F. & Antoniol, G. RePOR: Mimicking humans on refactoring tasks. Are we there yet?. Empir Software Eng 25, 2960–2996 (2020). https://doi.org/10.1007/s10664-020-09826-7

